Abstract

A cellular automaton (CA) is a powerful and popular decentralized computing model with a wide variety of applications. A CA consists of an array of similar cells, each with a definite state, that interact with one another in a neighbourhood relationship. The clustering and classifier models of CA are very popular for deriving significant knowledge from large volumes of data. This paper presents cellular automata clustering using morphological reconstruction (CACMR) on brain MRI images. The proposed morphological reconstruction operator accurately segments the skull, the brain and the background so that the tumour can be identified accurately. To cluster the tumour, the paper uses an efficient cellular automata method based on the Moore neighbourhood. The proposed CACMR is tested on brain MRI images collected from various websites, and various statistical measures indicate the significance and accuracy of the proposed method.

Keywords: Reconstruction, segmentation, MRI,
Moore neighbourhood, statistical measures. 

1. Introduction 

The cellular automaton (CA) is a decentralized computing model that provides an excellent platform for performing complex computation using only local information. The CA paradigm of local information, decentralized control and universal computation for modelling different applications is well established and has been exploited by researchers, scientists and practitioners from various fields over the last decades. The popularity of cellular automata can be traced to their simplicity and to the enormous potential they hold for modelling complex systems in spite of that simplicity. Cellular automata are mathematical models [4] and non-linear dynamical systems. In a CA, space and time are discrete, and the automata are termed cellular because they are made up of cells. These cells are treated as points in a lattice, or squares of a checkerboard, and are referred to as 'automata' [12]. A CA comprises a large number of relatively simple individual units, or "cells". Each cell, in turn, is a simple finite automaton that repeatedly updates its own state, where the new cell state depends on the current state of the cell and of its immediate (local) neighbours [11,16]. Owing to these qualities, CAs have been used extensively to study complex systems in the environment [7]. The
provision of efficient, high-speed techniques that do not violate the constraints of the data stream environment has emerged as the most important challenge. This has led to the emergence of data mining with cellular automata [14]. Data mining methods with CA are essential in several application domains such as online photo and video streaming services, economic analysis, concurrent manufacturing process control, search engines, spam filters, security, and medical services [12]. The characteristics of cellular automata can be summarized as follows [2]:
1. A cellular automaton is discrete in time and space.
2. Each cell has a finite number of states.
3. All the cells are identical.
4. All the cells are updated simultaneously.
5. The rule at each site depends on the values of the sites in its neighbourhood.
6. The rule for the new value of each site also depends on the values of a finite number of preceding states.
Two basic parameters characterize a CA: the number of states, denoted by k, and the radius of its neighbourhood [19], denoted by r. Each cell of the CA vector is endowed with 2 states and a radius of 1. The corresponding update rules are known as transition rules [9].
Cellular automata methods in data mining applications [20] encapsulate input data d from a definite time interval of observation, and transition rules are derived by applying data mining methods [3]. CAs offer several merits for modelling, including their decentralized approach, the link they provide between simple rules and complexity, the association of form with function and of pattern with process, the relative ease of visualizing the model outcomes, their flexibility, their dynamic approach, and their affinity with geographical information systems and remotely sensed data. Of all these, the most noteworthy quality is simplicity [1].
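The two parameters k and r described above can be illustrated with a minimal sketch of a one-dimensional CA with k = 2 states and radius r = 1. The function name `step`, the periodic boundary and the use of Wolfram-style rule numbers are assumptions made for illustration; the paper does not specify them.

```python
def step(cells, rule_number):
    """One synchronous update of a 1-D CA with k = 2 states and radius r = 1."""
    n = len(cells)
    # Bit i of rule_number is the new state for the neighbourhood whose
    # (left, centre, right) states encode the integer i (Wolfram numbering,
    # an illustrative convention that the paper does not prescribe).
    table = [(rule_number >> i) & 1 for i in range(8)]
    return [
        table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
        for i in range(n)  # periodic boundary: the lattice wraps around
    ]


cells = [0, 0, 0, 1, 0, 0, 0]
next_cells = step(cells, 90)  # rule 90: new state = left XOR right
print(next_cells)
```

Every cell is updated simultaneously from the previous configuration only, which matches characteristics 4 and 5 in the list above.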

Today a very large quantity of data is generated by various applications such as high-speed networking, finance logs, sensor networks, and web tracking [13]. The enormous amount of data collected from these sources arrives at the system as an unbounded data sequence [17]. This raises the problem of how to deal with such a gigantic data stream [15]. A wealth of methods has been designed for mining items or models from data streams [8] using CA. Javier de Lope et al. [22] proposed a data clustering algorithm based on the concept of treating the individual data items as cells of a one-dimensional cellular automaton. It integrates insights from social segregation models rooted in cellular automata theory, where the data items themselves are able to move freely in lattices, and from ant clustering algorithms. Liliana Perez et al. [23] introduced the integration of ABM with the CA technique to tackle modelling at both fine and large spatial scales. The discrete nature of CA facilitates integration with raster-based geospatial datasets in GIS, and is also advantageous during modelling. A simple edge detection technique for digital images based on cellular automata has also been proposed in the literature [24]. The detection process is applicable to both monochromatic and colour images. Haijun Wang et al. [25] proposed a cloud-based CA to characterize uncertainty propagation.

Mathematical morphology [33] is an image transformation technique that locally modifies geometric features through set operations. It is a powerful tool with various applications, such as nonlinear image filtering, noise suppression, smoothing, shape recognition, shape reconstruction [28, 29], skeletonization [5, 6, 10, 21], texture segmentation [27], classification [30] and medical image processing [31, 32], and it is becoming very common in image processing. There are three prerequisites for the fuller realization of the potential of morphology: i. complex processing combining various morphological operations (including other operations, such as discrete-time cellular neural networks, linear filtering, and area calculation); ii. processing with large and complex structuring elements; iii. high-speed (real-time) processing.

The paper is organized as follows. Section 2 describes the proposed methodology, Section 3 presents the experimental results and discussion, and Section 4 gives the conclusions.



2. Methodology 

The proposed method consists of two phases. In the first phase, skull stripping is performed. For this the paper uses morphological reconstruction, which segments the skull better than the existing methods, since it overcomes the effect of noise, smooths the image and segregates the background pixels. In the second phase, cellular automata are used to cluster the region of interest, so that brain tumours can be detected easily and accurately. The accuracy of the clustered image is measured using statistical measures. A key feature of the proposed method is that it applies morphological opening by reconstruction to segment magnetic resonance images (MRI) of the brain. MRI is characterized by high spatial resolution and soft tissue contrast. These two characteristics make MRI one of the most useful imaging modalities in the diagnosis of brain-related pathologies. The proposed morphological opening by reconstruction segments, as accurately as possible, the skull, the brain and the background. In the proposed method the skull is eliminated from the MRI by reconstruction. Otsu thresholding of the result then isolates the region of interest. Finally, the proposed cellular automata clustering based on the Moore neighborhood extracts the region of interest in the MRI.

2.1 Proposed morphological reconstruction 

The present paper uses a new variant of the morphological reconstruction method.

Morphological reconstruction is a little-known method for extracting significant information about shapes in an image. The shapes could be just about anything: the skull in an MRI, letters in a scanned text document, fluorescently stained cell nuclei, galaxies in a far-infrared telescope image, etc. Morphological reconstruction can be used in several applications, some of which are listed here: i) extracting marked objects; ii) detecting and removing objects that touch the image border; iii) filtering out spurious high or low points; iv) locating bright regions surrounded by darker pixels; v) detecting or filling in object holes. There are two types of morphological reconstruction: morphological opening by reconstruction and morphological closing by reconstruction.

2.2 Opening by reconstruction 

Opening is defined as erosion followed by dilation of the image with a given structuring element. The structuring element plays a vital role in this operation. In morphological opening, objects that are smaller than the structuring element are removed by erosion, and the subsequent dilation tends to restore ('regrow') the shape of the remaining objects. The main problem with opening is that the accuracy of this restoration depends on the similarity between the structuring element and the shape. The morphological opening by reconstruction proposed in this paper overcomes this disadvantage of simple opening: it restores the original shapes of the objects that remain after erosion.

Two images and a structuring element (instead of a single image and a structuring element) are needed to perform a morphological reconstruction. The first image, the marker image, is the starting point for the transformation. The second image, the mask image, constrains the transformation. The structuring element defines the connectivity. The proposed method uses the original MRI as the mask image and the eroded MRI as the marker image. The proposed morphological reconstruction filters enable complete extraction of the marked objects while preserving the edges of the MRI; that is, these filters preserve the contours. In addition, these reconstruction transformations create no new regional extrema in the MRI. The fundamental operations of mathematical morphology, dilation and erosion, are given in equations 1 and 2.

Dilation:

$$D(A,B) = A \oplus B = \bigcup_{\beta \in B} (A + \beta) \tag{1}$$

Erosion:

$$E(A,B) = A \ominus (-B) = \bigcap_{\beta \in B} (A - \beta) \tag{2}$$

where A is the original binary image and B is the structuring element.

The dilation operation fills the gaps or holes in the 
image and erosion widens the gaps. 

The opening of image A by structuring element B, denoted $A \circ B$, is defined as:

$$A \circ B = (A \ominus B) \oplus B \tag{3}$$

Thus, the opening of A by B is the erosion of A by B, followed by a dilation of the result by B.

The closing of image A by structuring element B, denoted $A \bullet B$, is defined as:

$$A \bullet B = (A \oplus B) \ominus B \tag{4}$$

The closing of A by B is simply the dilation of A by B, followed by the erosion of the result by B.
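The operations defined above, together with opening by reconstruction using the eroded image as marker and the original image as mask, can be sketched as follows. This is a minimal NumPy illustration for binary images, not the authors' implementation: the function names, the symmetric 3 x 3 square structuring element (so reflection of B can be ignored) and the toy image are all assumptions.

```python
import numpy as np


def dilate(img, se):
    """Binary dilation: union of the image shifted by every offset in se."""
    img = img.astype(bool)
    ph, pw = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=False)
    out = np.zeros_like(img, dtype=bool)
    for dy in range(se.shape[0]):
        for dx in range(se.shape[1]):
            if se[dy, dx]:
                out |= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out


def erode(img, se):
    """Binary erosion: intersection of the shifted images (outside = background)."""
    img = img.astype(bool)
    ph, pw = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=False)
    out = np.ones_like(img, dtype=bool)
    for dy in range(se.shape[0]):
        for dx in range(se.shape[1]):
            if se[dy, dx]:
                out &= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out


def opening(img, se):
    """Plain opening: erosion followed by dilation (equation 3)."""
    return dilate(erode(img, se), se)


def reconstruct(marker, mask, se):
    """Grow the marker by repeated dilation, never leaving the mask."""
    current = marker.astype(bool) & mask.astype(bool)
    while True:
        grown = dilate(current, se) & mask.astype(bool)
        if np.array_equal(grown, current):
            return current
        current = grown


def opening_by_reconstruction(img, se):
    """Marker = eroded image, mask = original image, as described in the text."""
    return reconstruct(erode(img, se), img, se)


se = np.ones((3, 3), dtype=bool)      # assumed 3 x 3 square structuring element
img = np.zeros((8, 8), dtype=bool)    # hypothetical toy image
img[1:4, 1:4] = True                  # a 3 x 3 object ...
img[2, 4] = True                      # ... with a one-pixel protrusion
img[6, 6] = True                      # an isolated speck (noise)

opened = opening(img, se)
rec = opening_by_reconstruction(img, se)
```

In this toy case plain opening removes both the speck and the protrusion, whereas opening by reconstruction removes the speck but restores the protrusion, which is the shape-preserving behaviour the section describes.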

The structuring element B used in this paper is: 

2.3 Clustering using cellular automata

A cellular automaton consists of a grid of cells in a finite number of dimensions, where each cell has a finite number of states. There are two common types of neighborhood in CA. The first is the von Neumann neighborhood, in which each cell has four neighbors. The second is the Moore neighborhood, which contains eight cells: it also includes the diagonal cells, so that it comprises all the cells surrounding a central cell on a two-dimensional square grid. It is named after Edward F. Moore, a pioneer of cellular automata theory. The Moore neighborhood is used in the well-known Conway's Game of Life, and corresponds to the notion of 8-connected pixels in computer graphics. The idea can be extended to higher dimensions. Therefore, in the Moore neighborhood, the connectivity (CC) of a tessellation (TT) is 8.
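The difference between the two neighborhoods can be made concrete with a small sketch. The names `VON_NEUMANN`, `MOORE` and `neighbours` are illustrative, and offsets are given as (row, column) pairs.

```python
# Offsets (d_row, d_col) of the neighbouring cells around a central cell.
VON_NEUMANN = [(-1, 0), (0, 1), (1, 0), (0, -1)]            # 4 neighbours
MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
         (1, 1), (1, 0), (1, -1), (0, -1)]                  # 8 neighbours


def neighbours(row, col, offsets, shape):
    """Return the in-grid neighbours of (row, col) for the given offsets."""
    rows, cols = shape
    return [
        (row + dr, col + dc)
        for dr, dc in offsets
        if 0 <= row + dr < rows and 0 <= col + dc < cols
    ]


print(len(neighbours(1, 1, VON_NEUMANN, (3, 3))))
print(len(neighbours(1, 1, MOORE, (3, 3))))
```

Cells on the border of the grid simply have fewer neighbours in this sketch; other boundary treatments (for example wrapping) are equally possible.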

The proposed cellular clustering algorithm based on 
Moore neighborhood is given below. 

1. Obtain the resultant image from morphological reconstruction.

2. A boundary pixel is a pixel that differs from the current pixel.

3. Create a vector V and set V = null.

4. Examine the tessellation (TT) in all directions to find whether any black pixel (Bp) (i.e. pixel value = 0) exists in the tessellation.

5. If a black pixel (Bp) exists in the tessellation, add it to the vector V.

6. Assign the black pixel found (Bp) as the current boundary point (bp), and assign the pixel from which Bp was entered during the examination as the neighborhood of the current boundary pixel (np).

7. Set the current pixel (cp) to the next clockwise pixel from the neighborhood of the boundary pixel (np).

8. Check whether the current pixel (cp) is equal to the black pixel (Bp) already detected.

9. If cp is a black pixel, add it to the vector V and go to step 11.

10. If it is not, go to step 4.

11. Assign the neighborhood of the current boundary point (np) as the new boundary point (i.e., bp = np).

12. Set the current pixel as the new neighborhood (i.e., np = cp).

13. Assign the current pixel (cp) as the next clockwise pixel from the neighborhood of the boundary pixel (np).

14. If cp is not a black pixel, assign the current pixel (cp) as the next clockwise pixel from the neighborhood of the boundary pixel (np).

15. Update the neighborhood.

16. Continue this process until the current boundary pixel is equal to the starting boundary pixel for the second time.

Finally the clustered image is obtained. The termination condition given in step 16 restricts the set of contours that the algorithm will trace completely.
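The steps above closely resemble classical Moore-neighbour boundary tracing. The sketch below implements that classical procedure as an illustration, not the authors' exact algorithm: the function name, the raster-scan choice of the start pixel, and the simpler stopping rule (stop on the first return to the start pixel, rather than the second visit of step 16) are assumptions.

```python
# Clockwise ring of Moore neighbours as (d_row, d_col), starting at the west.
CLOCKWISE = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
             (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_boundary(img):
    """Trace the outer boundary of the first black (value 0) region found."""
    rows, cols = len(img), len(img[0])

    def is_black(r, c):
        return 0 <= r < rows and 0 <= c < cols and img[r][c] == 0

    # Raster scan for the first black pixel; it is entered from its west side.
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if img[r][c] == 0), None)
    if start is None:
        return []
    boundary = [start]                    # the vector V of the text
    bp = start                            # current boundary point
    np_ = (start[0], start[1] - 1)        # pixel from which bp was entered
    while True:
        k = CLOCKWISE.index((np_[0] - bp[0], np_[1] - bp[1]))
        for step in range(1, 9):          # walk clockwise around bp
            dr, dc = CLOCKWISE[(k + step) % 8]
            cp = (bp[0] + dr, bp[1] + dc)
            if is_black(*cp):
                pr, pc = CLOCKWISE[(k + step - 1) % 8]
                np_ = (bp[0] + pr, bp[1] + pc)   # last non-black pixel visited
                bp = cp
                break
        else:
            return boundary               # isolated pixel: no black neighbour
        if bp == start:                   # simplified stopping rule (assumption)
            return boundary
        boundary.append(bp)


image = [[1, 1, 1, 1, 1],
         [1, 0, 0, 0, 1],
         [1, 0, 0, 0, 1],
         [1, 0, 0, 0, 1],
         [1, 1, 1, 1, 1]]
contour = trace_boundary(image)
```

The simplified stopping rule can end the trace early on shapes whose contour passes through the start pixel more than once, which is the kind of restriction the remark on step 16 refers to.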

3. Experimental results and discussion 

The proposed CACMR was tested on a large set (1000) of brain MRI images collected from various websites. Figure 1(a) shows four original MRI images; Figures 1(b), 1(c) and 1(d) show the result of opening by reconstruction, the image after threshold optimization, and the image clustered using cellular automata, respectively, for the corresponding original MRI images of Figure 1(a).

Graphics, Vision and Image Processing, V. 15, No. 2, ISSN 1687-398X, Delaware, USA, December 2015





Figure 1: (a) Input image, (b) opening by reconstruction, (c) image after threshold optimization and (d) image clustered using cellular automata.

3.1 Performance Analysis 

To evaluate the performance of the proposed CACMR, the paper computes statistical measures of the performance of a medical test: sensitivity, specificity and accuracy. These measures are widely used in statistics to describe the performance of a classification function. Sensitivity, also called the true positive rate or the recall rate, measures the proportion of actual positives that are correctly identified; it is therefore complementary to the false negative rate. Specificity (sometimes called the true negative rate) measures the proportion of negatives that are correctly identified as such (e.g., the percentage of healthy people who are correctly identified as not having the condition), and is complementary to the false positive rate. The average values of these measures are plotted in Figure 2. The proposed CACMR method outperforms the existing methods.



Figure 2: Performance measures comparison in terms of Accuracy, Sensitivity and Specificity for the proposed technique and existing technique.

Figure 4: Performance of the proposed method and the existing technique in terms of Precision and Recall.

The paper also evaluates the false acceptance ratio (FAR) and false rejection ratio (FRR) on the above MRI images using the proposed and the existing methods; the results are displayed in Figure 3. FAR and FRR are obtained by subtracting the sensitivity and specificity values from one. Low values of FAR and FRR indicate better performance.
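These measures follow directly from the confusion counts, as the minimal sketch below shows. The function name and the counts are hypothetical, and the mapping FAR = 1 - specificity, FRR = 1 - sensitivity is the usual convention, assumed here because the text does not state which measure pairs with which.

```python
def medical_test_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, FAR and FRR from confusion counts."""
    sensitivity = tp / (tp + fn)              # true positive rate (recall)
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "FAR": 1.0 - specificity,             # assumed: false positive rate
        "FRR": 1.0 - sensitivity,             # assumed: false negative rate
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = medical_test_metrics(tp=90, tn=80, fp=20, fn=10)
print(metrics)
```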



Figure 3: Performance measures comparison in terms of FAR and FRR for the proposed technique and existing technique.



The paper also evaluates other performance measures, namely precision and recall, to demonstrate the efficacy of the method. Figure 4 shows the average precision and recall rates. The average precision and recall of the proposed method are higher and lower, respectively, than those of the existing methods.



Figure 5: Performance comparison in terms of DB Index, 
Dunn Index, Jaccard Index, Rand Index and F -Measure for the 
proposed technique and existing technique. 



3.2 Clustering Performance Evaluation by 
Benchmark Functions 

To evaluate the performance of the proposed method, the paper computes evaluation schemes for clustering methods, namely the Davies-Bouldin index, Dunn index, Jaccard index, Rand index [26] and F-measure, for the proposed CACMR and the other existing techniques. These evaluation schemes indicate how accurately the clustering has been done on the dataset, based on the quantities and features of the derived model. The clustering algorithm that produces a collection of clusters with the smallest Davies-Bouldin index is considered the best algorithm. A high Dunn index value is more desirable for a clustering algorithm. The Jaccard index takes a value between 0 and 1: an index of 1 means that the two datasets are identical, and an index of 0 indicates that the datasets have no common elements. The Rand index can be viewed as a measure of the percentage of correct decisions made by the algorithm. The F-measure combines the precision and recall values into a single value. The results are shown in Figure 5.

Compared with the existing technique, the proposed CACMR technique has a low value of the DB index and high values of the Dunn index, Jaccard index, Rand index and F-measure, which indicates that the proposed clustering technique performs better than the existing technique.
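Three of these measures have simple closed forms, sketched below under the standard textbook definitions (Jaccard as intersection over union, Rand as the fraction of agreeing pairs, F-measure as the harmonic mean of precision and recall). The function names and sample inputs are hypothetical; the paper does not give its exact formulas.

```python
from itertools import combinations


def jaccard_index(a, b):
    """|A intersect B| / |A union B| for two collections of elements."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs, for illustration only.
print(jaccard_index({1, 2, 3}, {2, 3, 4}))
print(rand_index([0, 0, 1, 1], [0, 1, 1, 1]))
print(f_measure(0.5, 0.5))
```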



4. Conclusion 

The proposed cellular automata clustering efficiently uses morphological reconstruction for segmenting brain MRI. The main advantage of the proposed segmentation is that it segments, as accurately as possible, the skull, the brain and the background. To eliminate the skull from the image, an opening by reconstruction of size 3 x 3 is applied. Some of the existing methods fail to segment correctly if a thin connection exists between the skull and the brain; the proposed morphological reconstruction overcomes this. The performance of the proposed CACMR technique was also estimated using benchmark functions, and the outcome shows that the proposed CACMR technique converges faster than the existing technique.
Abstract

Iris recognition under non ideal imaging conditions such as eyelash occlusions, rotation of the eye and Charge Coupled Device (CCD) noise is a challenging problem that has attracted the attention of researchers. In the proposed method heterogeneous features are extracted using the 2D Discrete Orthonormal Stockwell Transform (DOST) and rotation invariant Local Binary Patterns (LBP). The DOST provides frequency information whereas the LBP provides spatial textural information. The feature set size is reduced based on the entropy of the features, which are used to train and test Support Vector Machines (SVMs) for checking the classification accuracy. The verification performance of the proposed scheme is validated using benchmark databases, and the Correct Recognition Rate (CRR) is found to be more than 99% even under non ideal imaging conditions.

Keywords: Biometrics, Iris recognition, S Transform, LBP, Statistical Moments, SVMs.

1 Introduction 

Iris recognition is considered to be a popular and sophisticated biometric technique. The iris is the annular portion between the dark pupil and the white sclera of the human eye; it is an internal organ, yet externally visible, and has an extraordinary texture that is unique to each individual. The contraction and dilation of the iris muscles control the amount of light entering the eye. Textural features such as freckles, ridges, the corona and the collarette region remain the same throughout life. The iris biometric is well accepted by the public because the imaging technique is non invasive, and it has gained the attention of researchers because iris patterns are stable and unique. Several international airports have established iris scan systems to identify passengers and facilitate quick processing. Artefacts in iris images such as poor brightness, low contrast, motion blur, specular reflections, eye rotation, direction of gaze, pupil contraction and dilation, and eyelid and eyelash occlusions increase the false negative rate. The decision environment for iris recognition is influenced by many unfavourable conditions, shown in Figure 2, such as non ideal imaging conditions, rotation of the eye and direction of gaze, CCD noise and particularly eyelash and eyelid occlusions, as established in [4] and [5].

Whilst the majority of researchers have focussed on preprocessing of the iris region, such as segmentation and noise reduction, new trends have been introduced in the area of processing and recognizing non ideal iris images. A comparison of existing iris recognition algorithms is given in Table 1, and a comprehensive survey can be found in [2]. The relationship between within-class variability and between-class variability is the core issue in pattern recognition. It is determined by the degrees of freedom of the pattern classes. The separation or overlap among pattern classes affects the decidability index of a pattern recognition algorithm. Preferably the within-class variability should be small and the between-class variability large.
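One common way to quantify this separation in the iris literature is Daugman's decidability index, d' = |mu_1 - mu_2| / sqrt((sigma_1^2 + sigma_2^2) / 2), computed from the within-class and between-class score distributions. The sketch below assumes this definition, which the text does not spell out, and uses hypothetical dissimilarity scores.

```python
from math import sqrt
from statistics import mean, pvariance


def decidability_index(within_scores, between_scores):
    """d' = |mu_w - mu_b| / sqrt((var_w + var_b) / 2) (assumed definition)."""
    mu_w, mu_b = mean(within_scores), mean(between_scores)
    var_w, var_b = pvariance(within_scores), pvariance(between_scores)
    return abs(mu_w - mu_b) / sqrt((var_w + var_b) / 2)


# Hypothetical dissimilarity scores, for illustration only.
within = [0.1, 0.2, 0.3]     # comparisons of images of the same iris
between = [0.4, 0.5, 0.6]    # comparisons of images of different irises
d_prime = decidability_index(within, between)
print(round(d_prime, 3))
```

A larger d' corresponds to small within-class variability and large between-class separation, which is the situation described as preferable above.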

To achieve this, feature selection plays a crucial role. Most iris recognition algorithms use either multi resolution transforms such as Gabor, or spatial domain features, for iris feature extraction. This paper presents a new feature extraction algorithm based on heterogeneous features: a combination of the 2D DOST proposed in [6], which is a multi resolution transform, and the rotation invariant LBP proposed in [9], which is a spatial domain technique. It gives classification rates of more than 99% when trained and tested using SVMs.

The iris region is first segmented and then unwrapped. After normalization of the iris region, the textural features are extracted using the 2D DOST, which decomposes the image into different frequencies with bandwidths increasing dyadically. The rotation invariant LBP features along with the DOST coefficients form the feature set pool, as shown in Figure 1. The feature vectors in the pool are then subjected to entropy based reduction and classified using SVMs. SVMs are successfully used for classification as they discriminate one class from the rest by a hyperplane with a large margin. The proposed method is validated using the UBIRIS, CASIA, WVU and IITD databases.

Figure 1: Iris Feature Extraction Algorithm

The rest of the paper is organized as follows. Section 2 explains the segmentation and normalization of the iris region. The heterogeneous textural feature collection, feature vector generation using DOST and LBP, and feature set reduction based on entropy are presented in Section 3. The classification using SVMs is described in Section 4. Section 5 presents the experimental results, comparisons and discussion, and Section 6 presents the conclusions and scope of future work.

1.1 Related Work 

A detailed survey of iris recognition algorithms was presented in [2]. A zero crossing method with dissimilarity functions for matching was employed by Boles. Iris textural data were extracted using the 2D Haar transform in [7]. Multi channel Gabor filtering was used for the extraction of iris features in [8]. The Hilbert transform was used for iris feature extraction by Tisse. Level sets were applied for segmentation and wavelet features were extracted in [10]. A circular Hough transform with a dynamic range determination technique for improved segmentation of iris and pupil boundaries was proposed by Basma in [1]. A segmentation technique based on morphological processing, which accurately detects the pupil boundaries, was presented by Uma in [14]. Legendre moments were applied to capture iris features corrupted by non ideal imaging conditions and CCD noise in [13]. Eyelash occlusions were eliminated using the Monro Iris Transform algorithm in [17], in which the occluded pixels were filtered and the texture was recovered.

In this paper both 2D DOST coefficients and rotation invariant LBP features are extracted, their first order statistical moments are computed, and a compact feature vector set is then built using entropy based ranking. These feature vectors are used to train and test SVMs for checking the classification performance.



2 Iris Segmentation and Normalization

The circular iris and pupil regions are located using Daugman's integro-differential operator. Dimensional inconsistencies among iris images are caused by contraction and dilation of the pupil resulting from varying illumination levels during image acquisition. Also, the pupil region is not exactly concentric with the iris region. For these reasons, the iris region is transformed from circular to Cartesian form of fixed dimensions in order to facilitate comparison. The normalized iris images are enhanced using high boost filtering so that the edge details are emphasized and the contrast is improved. The integro-differential operator for segmentation and the rubber sheet model for normalization are given in [3].
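The circular-to-Cartesian unwrapping can be sketched as follows, assuming the usual rubber sheet mapping in which each sample lies on the straight line between the pupil boundary and the iris boundary at a given angle. The function name, the nearest-neighbour sampling, the (cx, cy, radius) circle parameters and the 16 x 256 output size are illustrative assumptions, not details taken from [3].

```python
import numpy as np


def rubber_sheet(image, pupil, iris, radial_res=16, angular_res=256):
    """Unwrap the annulus between two circles into a radial_res x angular_res grid.

    pupil and iris are (cx, cy, radius) tuples; the circles need not be
    concentric. Sampling is nearest-neighbour (an illustrative simplification).
    """
    thetas = np.linspace(0.0, 2.0 * np.pi, angular_res, endpoint=False)
    rs = np.linspace(0.0, 1.0, radial_res)[:, None]       # 0 = pupil, 1 = iris
    # Boundary points on the pupil and iris circles for every angle.
    xp = pupil[0] + pupil[2] * np.cos(thetas)
    yp = pupil[1] + pupil[2] * np.sin(thetas)
    xi = iris[0] + iris[2] * np.cos(thetas)
    yi = iris[1] + iris[2] * np.sin(thetas)
    # Linear interpolation between the two boundaries.
    x = (1.0 - rs) * xp[None, :] + rs * xi[None, :]
    y = (1.0 - rs) * yp[None, :] + rs * yi[None, :]
    cols = np.clip(np.rint(x).astype(int), 0, image.shape[1] - 1)
    rows = np.clip(np.rint(y).astype(int), 0, image.shape[0] - 1)
    return image[rows, cols]


# Hypothetical image and circle parameters, for illustration only.
image = np.arange(100 * 100, dtype=float).reshape(100, 100)
template = rubber_sheet(image, pupil=(50, 50, 10), iris=(50, 50, 40))
print(template.shape)
```

Because the output has fixed dimensions regardless of pupil dilation, templates from different acquisitions can be compared element by element.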



3 Heterogeneous Feature Extraction using 2D DOST, LBP and Statistical Moments



The S-Transform, named after its inventor Stockwell and proposed in [12], is well suited to the classification of noisy textures. It has been shown that the S-Transform is more immune to noise than wavelet transforms. The 2D DOST of an image is calculated in the frequency domain using a dyadic sampling scheme. The 2D Fourier transform of a discrete signal $f[x,y]$ in the x and y directions is given by

$$F[m,n] = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f[x,y]\, e^{-2\pi i \left( \frac{mx}{M} + \frac{ny}{N} \right)} \tag{1}$$

and the 2D inverse Fourier transform is given by

$$f[x,y] = \frac{1}{MN} \sum_{m=-M/2}^{M/2-1} \sum_{n=-N/2}^{N/2-1} F[m,n]\, e^{2\pi i \left( \frac{mx}{M} + \frac{ny}{N} \right)} \tag{2}$$

The 2D DOST of an N x N image is computed by splitting the 2D FT of the image, $F[m,n]$, into blocks, scaling by the square root of the number of points in the block, and then computing the inverse 2D FT:

$$S[x', y', v_x, v_y] = \frac{1}{\sqrt{2^{p_x + p_y - 2}}} \sum_{m=-2^{p_x-2}}^{2^{p_x-2}-1} \; \sum_{n=-2^{p_y-2}}^{2^{p_y-2}-1} F[m + v_x, n + v_y]\, e^{2\pi i \left( \frac{m x'}{2^{p_x-1}} + \frac{n y'}{2^{p_y-1}} \right)} \tag{3}$$

The Fourier spectrum is divided such that the wave numbers $(v_{x_0}, v_{y_0})$ are shifted to the zero wave number point, and a $2^{p_x-1} \times 2^{p_y-1}$ point inverse Fast Fourier Transform is applied, which results in a rectangular image block of size $2^{p_x-1} \times 2^{p_y-1}$. The total number of points in the 2D DOST image and the original image are the same.

Table 1: Comparison of Proposed Method with Existing Iris Recognition Methods

| Author | Segmentation Technique | Feature Set Nature | Matching Process | Quality Evaluation |
|---|---|---|---|---|
| Daugman [3], [4], [5] | Integro-differential operator | Binary feature vector using 2D Gabor filters | Hamming distance | Good recognition rate |
| Wildes et al. [16] | Image intensity gradient and Hough transform | Laplacian pyramid 1D signature | Normalized correlation | Matching process time consuming. Suitable for identification but not for recognition |
| Ma et al. [8] | Canny edge detection and Hough transform | 1D real valued feature vector using wavelets | Weighted Euclidean distance | Local features used for iris recognition |
| Schuckers et al. [11] | Integro-differential operator and angular deformation method | ICA and biorthogonal wavelets | Hamming distance | Enhanced performance on non ideal databases |
| Vatsa et al. [15] | Mumford-Shah functional | 1D Gabor filter for textural features and Euler number for topological features | Iris indexing algorithm | Good identification rate on non ideal iris databases |
| Roy and Bhattacharya [10] | Levelset methods | Wavelet features | Adaptive Asymmetrical SVMs | High recognition rate with non ideal datasets |
| P.V.L. Suvarchala et al. [13] | Integro-differential operator | Exact Legendre moments | SVMs | High recognition rate in non ideal imaging conditions and CCD noise |
| Zhang et al. [17] | | Monro Iris Transform | Weighted Hamming distance | Eyelash occlusions are removed using non linear filtering |
| Proposed Method | Integro-differential operator | Heterogeneous features using 2D DOST and LBP | Support Vector Machines | Improved % CRR due to compact feature set |

Figure 2: Iris images from CASIA, UBIRIS, WVU and IITD databases with specular reflections, pupil dilation, rotation of eye, eyelash occlusions, CCD noise, and eyelid occlusion along with pupil dilation leaving very little active iris area

The inverse of the 2D DOST can be found by applying the forward 2D FFT to each image block to reverse the spectral partitioning, so that the spectrum of the image is reconstructed:

$$F[m,n] = \sqrt{2^{p_x + p_y - 2}} \sum_{m=-2^{p_x-2}}^{2^{p_x-2}-1} \; \sum_{n=-2^{p_y-2}}^{2^{p_y-2}-1} S[m - v_x, n - v_y]\, e^{-2\pi i \left( \frac{m x'}{2^{p_x-1}} + \frac{n y'}{2^{p_y-1}} \right)} \tag{4}$$

Now the original image can be reconstructed by applying the inverse Fourier transform as follows:

$$f[x,y] = \frac{1}{N^2} \sum_{m=-N/2}^{N/2-1} \; \sum_{n=-N/2}^{N/2-1} F[m,n]\, e^{2\pi i \left( \frac{mx}{N} + \frac{ny}{N} \right)} \tag{5}$$

The DOST and the Discrete Wavelet Transform (DWT) both use a dyadic sampling scheme with orders of 0, 1, 2, 4, ..., $\log n - 1$. However, they provide dissimilar information regarding the frequency content of the image. The DWT generates horizontal, vertical and diagonal detail coefficients for each order of sampling, while the DOST provides the frequencies $(v_x, v_y)$ with a bandwidth of $(2^{p_x-1}, 2^{p_y-1})$.

DOST separates an image into horizontal and vertical
frequencies with different bandwidths. It can
provide a spatial frequency representation while
maintaining the phase properties of the Fourier Transform.
An additional feature of 2D DOST is that it can give a pixel
by pixel texture description of the image by generating
the local spectrum that contains the horizontal
and vertical frequency content from the Fourier Transform
of the image, as given in [6].

The local spatial frequency description at a single
pixel or patch of pixels can be calculated from
$S[x', y', v_x, v_y]$ for all $(v_x, v_y)$.

The proposed local spectrum extraction algorithm
is as follows:

1. Resize the iris template $I(x,y)$ to 16 x 256 and
normalize the intensity values to lie between 0 and 1;

2. Divide the iris image into 16 subimages of size 16 x 16,
$I_1(x,y), I_2(x,y), \ldots, I_{16}(x,y)$;

3. Let the horizontal and vertical voice frequencies be
$v_x = 16$ and $v_y = 256$;

4. Create and initialize the local spectrum matrix of
size $\log_2(v_x) \times \log_2(v_y)$ to contain zeros;

5. Create and initialize the feature vector FV to contain zeros;

6. forall subimages $I_1(x,y)$ to $I_{16}(x,y)$ do

7. forall $v_x$ in 1 to $\log_2(v_x)$ and $v_y$ in 1 to $\log_2(v_y)$ do

8. $S[x,y,v_x,v_y]$ <- Compute 2D DOST;

9. end

10. Compute I order statistical moments for the local
spectrum $S[x,y,v_x,v_y]$;

11. FV <- statmoments;

12. end



3.1 Rotation Invariant LBP Features 

LBP is an operator based on the signs of the
differences between neighboring pixels in the image.
It is immune to monotonic changes in the gray
levels of the image. Gray scale and rotation invariant
LBP were introduced in [9]; they characterize the
local texture of the image and also the contrast. The
gray scale invariant LBP in a 3 x 3 neighborhood of a
pixel is given by the following expression:

$$LBP_8 = \sum_{i=1}^{8} s(g_i - g_0)\, 2^{i-1} \tag{6}$$

where

$$s(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases} \tag{7}$$

and $g_0$ is the gray level value of the center pixel and $g_i$, $i = 1, 2, \ldots, 8$,
are the gray level values of the eight surrounding
pixels of the circularly symmetric neighborhood shown
in Figure 3.

Figure 3: Circularly symmetric 3 x 3 neighborhood

Graphics, Vision and Image Processing, V. 15, No. 2, ISSN 1687-398X, Delaware, USA, December 2015

Rotational invariance of LBP was developed
based on the assumption that, out of the possible
$2^8 = 256$ patterns, only 36 patterns are unique, and
of these only 9 have a uniformity measure (U) of at
most 2, where U corresponds to the number of spatial
transitions (0/1 or 1/0) in the pattern. Such patterns are designated
as uniform. Hence the rotation invariant LBP is
expressed as

$$LBP_8^{riu2} = \begin{cases} \sum_{i=1}^{8} s(g_i - g_0) & \text{if } U(LBP_8) \le 2 \\ 9 & \text{otherwise} \end{cases} \tag{8}$$

The proposed LBP extraction algorithm is as follows:

1. Resize the iris template $I(x,y)$ to 16 x 256;

2. Divide the iris image into 16 subimages of size 16 x 16,
$I_1(x,y), I_2(x,y), \ldots, I_{16}(x,y)$;

3. Create and initialize lbpmat, of the size of a subimage,
to contain zeros;

4. Create and initialize the feature vector FV to contain zeros;

5. forall subimages $I_i(x,y)$ with i in 1 to 16 do

6. lbpmat <- Compute rotation invariant
LBP $\{I_i(x,y)\}$;

7. Compute I order statistical moments for the lbpmat;

8. FV <- statmoments;

9. end
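The rotation invariant uniform LBP described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB code: the function names, the neighbour ordering and the convention s(0) = 1 are assumptions made here, and border pixels are simply skipped.

```python
# Offsets (row, col) of the 8 neighbours, visited in circular order.
# The starting neighbour is an arbitrary choice (assumption); the
# rotation invariant code does not depend on it.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def sign_bit(x):
    """s(x): 1 when the gray-level difference is non-negative, else 0."""
    return 1 if x >= 0 else 0


def lbp_riu2_pixel(img, r, c):
    """Rotation invariant uniform LBP code of the pixel at (r, c)."""
    g0 = img[r][c]
    bits = [sign_bit(img[r + dr][c + dc] - g0) for dr, dc in NEIGHBOURS]
    # U: number of 0/1 or 1/0 transitions around the circle.
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    # Uniform patterns (U <= 2) map to the number of set bits (0..8),
    # every other pattern maps to the single label 9.
    return sum(bits) if transitions <= 2 else 9


def lbp_riu2_map(img):
    """LBP codes for all interior pixels of a 2D list of gray levels."""
    rows, cols = len(img), len(img[0])
    return [[lbp_riu2_pixel(img, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]
```

For a 16 x 16 subimage, `lbp_riu2_map` returns a 14 x 14 map of codes in the range 0 to 9, from which the block statistics can then be computed.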

3.2 Statistical Moments 

Let $Q_i$ be the $i$th local DOST spectrum or local binary
pattern consisting of M x N coefficients. The first order
statistical moments, mean, standard deviation, skew
and kurtosis, are calculated for each block as follows.
The first moment about the origin is known as the mean ($\mu_i$),
which gives the average value of the local spectrum or
pattern coefficients in the $i$th region:

$$\mathrm{mean}(\mu_i) = \frac{1}{MN} \sum_{x_1=1}^{M} \sum_{x_2=1}^{N} Q_i(x_1, x_2) \tag{9}$$

The square root of the second central moment is
known as the standard deviation ($\sigma_i$), which is the measure of
variability of the coefficients' values about the mean
$\mu_i$:

$$\mathrm{std}(\sigma_i) = \sqrt{\frac{1}{MN} \sum_{x_1=1}^{M} \sum_{x_2=1}^{N} \left( Q_i(x_1, x_2) - \mu_i \right)^2} \tag{10}$$

The third central moment is known as skew,
which describes how symmetric the coefficients are
about their mean:

$$\mathrm{skew}(s_i) = \frac{1}{MN} \sum_{x_1=1}^{M} \sum_{x_2=1}^{N} \left( \frac{Q_i(x_1, x_2) - \mu_i}{\sigma_i} \right)^3 \tag{11}$$

The fourth central moment is known as kurtosis,
which is the measure of flatness of the coefficient values
in the region:

$$\mathrm{kurtosis}(k_i) = \frac{1}{MN} \sum_{x_1=1}^{M} \sum_{x_2=1}^{N} \left( \frac{Q_i(x_1, x_2) - \mu_i}{\sigma_i} \right)^4 - 3 \tag{12}$$

After the local spectrum DOST coefficients and the
rotation invariant LBP of each block
in the iris template have been found, the first order moments are
computed and appended to form the feature vector. At
the end of each feature vector the corresponding class
label is attached, which facilitates supervised learning
and testing.
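As a rough illustration of how the four block statistics can be turned into a feature vector, the sketch below computes the mean, the population standard deviation, the standardized skew and the excess kurtosis of one block and concatenates them over blocks. The function names are hypothetical, and returning zero skew and kurtosis for a constant block is a convention assumed here, not something stated in the paper.

```python
import math


def block_moments(block):
    """Mean, std, skew and excess kurtosis of a 2D list of coefficients."""
    values = [v for row in block for v in row]
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # Constant block: skew and kurtosis are undefined, so 0 is
        # returned by convention (assumption).
        return mean, 0.0, 0.0, 0.0
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return mean, std, skew, kurt


def feature_vector(blocks):
    """Concatenate the four moments of every block into one flat list."""
    fv = []
    for block in blocks:
        fv.extend(block_moments(block))
    return fv
```

With 16 blocks this yields 64 values per feature type, so appending the DOST and LBP moments gives 128 values in total.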

3.3 Entropy Based Feature Set Reduction

Before classification of the heterogeneous features of
the iris images, the features are ranked according
to their entropy. The entropy of an ordered pattern
is lower than the entropy of a
disordered pattern. An irrelevant feature or signature
has more randomness, which means more uncertainty
and hence higher entropy, than a relevant signature.
The feature with the least entropy is given the first rank, and
so on. The entropy is computed as follows:

$$\mathrm{Entropy} = -\sum_{i=1}^{N} \sum_{j=1}^{N} \left[ d_{ij} \log(d_{ij}) + (1 - d_{ij}) \log(1 - d_{ij}) \right]$$

where $d_{ij}$ is the distance between features of N different
samples in the same class. The features that have the
least entropy are the most significant for classification.
Roy in [10] proposed genetic algorithm based feature
set reduction, whereas entropy based feature set reduction
is computationally less complex and
efficient.
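The ranking step can be sketched as follows. The paper does not spell out how the distance $d_{ij}$ is normalized, so this example assumes it is the absolute difference between two samples' values of a feature, divided by that feature's range so that it lies in [0, 1]; the function names are hypothetical.

```python
import math


def pair_entropy(d):
    """Entropy contribution of one normalized distance d in [0, 1]."""
    if d <= 0.0 or d >= 1.0:
        return 0.0  # limit of d*log(d) as d -> 0 is 0
    return -(d * math.log(d) + (1.0 - d) * math.log(1.0 - d))


def feature_entropy(values):
    """Sum of pair entropies over all ordered sample pairs (i, j)."""
    span = max(values) - min(values)
    total = 0.0
    for a in values:
        for b in values:
            # Range normalization of the distance is an assumption.
            d = abs(a - b) / span if span > 0 else 0.0
            total += pair_entropy(d)
    return total


def rank_features(samples):
    """Return feature indices sorted by increasing entropy, and the entropies.

    samples: list of feature vectors belonging to the same class.
    """
    n_features = len(samples[0])
    entropies = [feature_entropy([s[f] for s in samples])
                 for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: entropies[f])
    return order, entropies
```

Keeping the first 56 indices of `order` would correspond to retaining the 56 lowest-entropy features.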



4 Iris Classification with SVMs 

In iris recognition many researchers have applied the Hamming
distance for pattern matching [2]. Conventional
SVMs and non-symmetrical SVMs were employed
by a few researchers to distinguish the false positive
and false negative cases, and asymmetrical adaptive
SVMs were used in [10] to cut the matching times of
the test samples.

SVMs have been an excellent tool for data classification.
The essential idea is to map the data points
into a high dimensional space and separate them by
a hyperplane with the largest margin. In the proposed
method the heterogeneous feature vectors are trained
and tested using traditional SVMs with a linear kernel
at different train versus test ratios.
Table 2: Result on CASIA database - Comparison of EER (x 10^-3)

| Method/Parameter | Daugman | Tan | Monro | Roy | Proposed Method |
|---|---|---|---|---|---|
| EER | 6.91 | 3.13 | 0.26 | 0.22 | 0.18 |

Table 3: EER of Proposed Method on Different Iris Databases at Train : Test Ratio of 70% : 30%

| Iris Database | DOST Features - EER (x 10^-3) | LBP Features - EER (x 10^-3) | Heterogeneous Features - EER (x 10^-3) |
|---|---|---|---|
| CASIA-1 | 0.22 | 0.28 | 0.18 |
| CASIA-Interval | 0.96 | 1.2 | 0.82 |
| UBIRIS-1 | 0.186 | 0.198 | 0.142 |
| IITD | 0.96 | 0.98 | 0.86 |
| WVU | 1.2 | 1.44 | 0.98 |


4.1 Classification Evaluation 

The classification of iris feature vectors is carried
out using the osusvm package available at
http://kaz.dl.sourceforge.net/project/svm/svm/3.00/osusvm-3.0.zip.
The proposed method is tested at a
70% : 30% Train:Test ratio with DOST features and
rotation invariant LBP features independently, as well
as with heterogeneous features, using the CASIA-1,
CASIA-Interval, UBIRIS-1, WVU and IITD databases. The
multi-class SVMs, operated in the One-against-All
approach, are tuned with a linear support vector classifier
kernel. The performance of the proposed method is
appraised in terms of Equal Error Rate (EER) and
% CRR and compared with existing methods.

EER is the point at which the False Positive
Rate (FPR) is equal to the False Negative Rate
(FNR).

% CRR is defined as:

$$\%\,\mathrm{CRR} = \frac{\text{Correctly Recognized Users Number}}{\text{Total Number of Users Enrolled}} \times 100 \tag{14}$$
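The two evaluation measures can be illustrated with a short sketch. It assumes that matching produces similarity scores (higher means more alike) for genuine and impostor comparisons, which is a choice made for this example rather than something the paper specifies; the function names are hypothetical. The EER is approximated by the threshold at which FPR and FNR are closest.

```python
def error_rates(genuine, impostor, threshold):
    """FPR and FNR when scores >= threshold are accepted."""
    fnr = sum(s < threshold for s in genuine) / len(genuine)
    fpr = sum(s >= threshold for s in impostor) / len(impostor)
    return fpr, fnr


def equal_error_rate(genuine, impostor):
    """Approximate EER and the threshold where |FPR - FNR| is smallest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        fpr, fnr = error_rates(genuine, impostor, t)
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, (fpr + fnr) / 2.0, t)
    return best[1], best[2]


def percent_crr(predicted, actual):
    """Percentage of users whose predicted label equals the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```

With only a handful of scores the FPR and FNR curves are step functions, so the returned value is the midpoint at the closest crossing rather than an exact intersection.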



5 Results of Experiments 

The experiments are conducted on a PC with a Pentium
i3, 2 GHz processor and 2 GB RAM in the MATLAB
7.10 environment. The evaluation of the proposed
method is done on the CASIA version-1, CASIA Iris Interval,
UBIRIS version-1, WVU and IITD databases.
The proposed method is checked on original as well
as synthetic datasets of CASIA-1 and UBIRIS-1
generated under non-ideal imaging conditions and CCD
noise as described in [13].

The CASIA version-1 and Iris Interval databases, available
at http://nlpr-web.ia.ac.cn/english/irds/irisdatabase.html,
comprise 756 iris images taken in
two sessions from 108 classes and 249 classes with left
and right eyes respectively. Each iris image is 8-bit
gray scale with a resolution of 320 x 280. The UBIRIS
version-1 database, available at http://iris.di.ubi.pt,
adds 1205 iris images from 241 subjects, with 5 samples
from each subject, at a resolution of 150 x 200.
The IITD iris database contributes 2220 images of 5
samples from the left and right eyes of each class with a
resolution of 320 x 240. The WVU database
contains 1852 images from 380 different people.

The CASIA and IITD databases are of varying image
quality and suffer from eyelash and eyelid occlusions,
whereas the UBIRIS database also suffers from blur, specular
reflections and partial iris. WVU dataset images suffer
from eye rotation and gaze rotation. Hence the four
databases are preferred for the experiment, as they
contain the conditions suitable to evaluate the
performance of the proposed work.

5.1 Discussion on Results 

The iris region is segmented and normalized to obtain
a 20 x 240 size iris template. A preprocessing step
is performed before extraction of the DOST and LBP features
to enhance the contrast and highlight the edges
in the image by applying high boost filtering. The
spatio-frequency domain and spatial domain features
from all sub-blocks are extracted and the coefficients
are organized into a row, forming a feature vector of
size 128. After entropy based
ranking of the features, only the 56 features with low
entropy are considered for classification. The feature
vectors are then classified using SVMs tuned with a linear
kernel.

The proposed work is compared
with existing methods in terms of EER in Table 2,
and its EER is the lowest of all. The heterogeneous
feature set obtained after entropy based ranking
is of small size (56). The comparison of EER
and % CRR is shown in Tables 3 and 4. Both EER
and % CRR are improved in all datasets when
DOST and rotation invariant LBP features are combined
rather than used individually.

Table 4: % CRR of Proposed Method on Different Iris Databases at Train : Test Ratio of 70% : 30%

| Iris Database | DOST Features (% CRR) | LBP Features (% CRR) | Heterogeneous Features (% CRR) |
|---|---|---|---|
| CASIA-1 | 98.85 | 98.45 | 100 |
| CASIA-Interval | 97.4 | 97 | 99 |
| UBIRIS-1 | 99.42 | 98.78 | 100 |
| IITD | 98.8 | 98.2 | 99.81 |
| WVU | 98.2 | 97.6 | 99.2 |

The robustness of the proposed method is also evaluated
on synthetic datasets of CASIA-1 and UBIRIS-1
in which non-ideal imaging conditions such as camera
blur, low brightness, poor contrast and CCD noise
are simulated as proposed in [13]. The evaluation is
carried out at different train versus test ratios under
low, medium and high variance of CCD noise and
non-ideal imaging conditions. The % CRR results on the
synthetic datasets are shown in Figures 4 and 5. It
is observed that the proposed heterogeneous feature
extraction method also performs well under highly noisy
conditions.



Figure 4: % CRR result on CASIA synthetic dataset
- Non ideal imaging conditions and CCD noise simulated
as mentioned in [13]

6 Conclusion

A new feature extraction method using 2D DOST,
rotation invariant LBP and statistical moments,
followed by entropy based ranking, is proposed in
this paper, and the results show that the performance of the
iris recognition method is improved over the existing
methods. The results also show that the
combined spatio-frequency domain (DOST) features
and spatial domain (rotation invariant LBP) features
offer higher recognition rates than either used independently. The
heterogeneous features also prove to be good on synthetic
datasets that suffer from non-ideal imaging conditions
and CCD noise. The feature vector size is very
small (56) due to ranking, and there is scope for further
reduction in the size of the feature vector with feature
set reduction techniques.
Abstract

In this paper, we propose a single dictionary learning
algorithm that makes use of only high-resolution
images. Unlike other methods, the dictionary is trained
from a set of high-resolution image patches of size 5x5,
7x7 and 9x9 instead of patch pairs of high-/low-resolution
images. The advantage of this modification is
that there is no need to train the dictionary again when the
up-scaling factor is changed. A run-time improvement is achieved,
together with high quality of the reconstructed image, due to the single
dictionary. The simulation results show that the
proposed method achieves state-of-the-art results
compared to other super-resolution methods in terms of
both reconstruction ability and run-time. The
demonstration results for single image super-resolution
are promising.

Keywords: Bicubic Interpolation, Non Local Means, 
Super-resolution, Sparse Representation, Sparse Coding, 
Single Dictionary. 

1. Introduction 

High-resolution images are useful in many applications,
such as medical image diagnosis, remote sensing, video
surveillance and satellite imaging [1]. Due to
technological and economic constraints, however, the
availability of high-resolution images is limited. Compared to
other methods, super-resolution is one of the best
methods for reconstructing high-resolution images from the
corresponding low-resolution images. In recent years,
super-resolution reconstruction has been an active area in the image
processing field, which can help to overcome the
resolution limitations of low-cost imaging sensors.

"Single image super-resolution" deals with generating a
high-resolution (HR) image from a single low-resolution
(LR) image of the same scene. The most challenging task
is to describe the relation between low- and high-resolution
images [2]. Over-complete sparse
models are well suited to natural images and highly
robust to noise [3], and sparse representation based on a
learned dictionary has achieved the best performance on
image denoising and restoration [4].



The super-resolution task is an inverse problem that can
be extremely ill-posed, because many high-resolution
images can generate the same low-resolution image of a
scene. To stabilize this ill-posed problem,
regularization is an important procedure [5-7].
Super-resolution algorithms can be broadly classified into three major
categories: interpolation-based, reconstruction-based and
learning-based. Interpolation-based
super-resolution algorithms [8-10] are simple, but they
generate smooth images and tend to blur high-frequency
details. In learning-based super-resolution algorithms [8],
[11-13], detailed textures are recovered by searching
through a training set of low- and high-resolution images.
They need a proper choice of training images;
otherwise inaccurate information may be found.
Alternatively, reconstruction-based super-resolution
algorithms [14-19], which are restricted by the availability of
low-resolution images and by the up-scaling factor, apply the
constraint that the high-resolution image, based on a prior, should generate
the original low-resolution image. Yang et al. [12]
developed a sparse representation model to successfully
reconstruct the high-resolution image patch. From a
single input image, this method can reproduce both
textures and precise edges. Sparse
representation is also used effectively in many other fields, such as
image compression, image denoising [20] and face
recognition [21]. The dictionary plays an important role in
sparse representation. Many dictionary learning methods
have been developed to learn a dictionary from example
patches [22-30].

The major limitation of these joint dictionary methods is
that the dictionary must be retrained whenever the
up-scaling factor changes. In this paper, we present a simple and efficient
single image super-resolution algorithm to solve this
problem. The advantages of this algorithm are:

(1) The dictionary is trained from high-resolution
image patches only; hence,
retraining of a new dictionary is not needed
when the scaling factor changes. In addition, the
training time is reduced.

(2) The run-time is also reduced. We combine
the sparse representation patch-based
method and the reconstruction-based method into a
unified energy-function framework.

In our setting, we work with a single dictionary
trained on high-resolution image patches.

This paper is organized as follows: Section 2 describes
the mathematical background of related work, Section 3
presents the single dictionary learning and the proposed
algorithm, Section 4 shows experimental results and
an analysis of the quality parameters, and Section 5 concludes
the paper.



2. Mathematical Background [31] 

2.1 Non-Local Means 

In image processing, one of the best algorithms for image
denoising is Non-local means (NLM) filtering [32], which
takes a mean of all pixels in the image, weighted by how
similar these pixels are to the target pixel. This method
has become popular in many applications such as
de-interlacing and view interpolation. It can be
regarded as a learning-based hallucination
method in which the training samples originate from the
image itself. Basically, NLM is a weighted
filter which is formulated by

$$x_{ij} = \frac{\sum_{(p,q) \in P_{ij}} w_{ij}^{pq}\, y_{pq}}{\sum_{(p,q) \in P_{ij}} w_{ij}^{pq}} \tag{1}$$

where $y_{pq}$ denotes the intensity of the low-resolution
image Y at position (p, q) and $x_{ij}$ denotes the intensity of the
high-resolution image X at position (i, j). $P_{ij}$ is the index
set containing the coordinates of pixels similar to pixel (i, j).
The weight $w_{ij}^{pq}$ denotes the similarity between the
patches $R_{ij}Y$ (the patch centered at (i, j) in image Y) and
$R_{pq}Y$, which can be calculated by

$$w_{ij}^{pq} = \exp\left( -\frac{\left\| R_{ij}Y - R_{pq}Y \right\|_G^2}{h^2} \right) \tag{2}$$

where h plays the role of a global smoothing parameter,
G is a kernel matrix that makes the pixels
close to the center of the patch contribute more within the
target image patch, and $R_{ij}$ is an operator extracting a
patch centered at (i, j).
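The weighted average described above can be sketched for a single pixel as follows. This is a simplified illustration under stated assumptions: the kernel G is taken to be uniform (every patch pixel weighted equally), the squared patch distance is averaged over the patch, and the candidate set of similar pixels is passed in explicitly; the function names are hypothetical.

```python
import math


def patch(img, i, j, radius):
    """Flattened square patch of the given radius centered at (i, j)."""
    return [img[r][c]
            for r in range(i - radius, i + radius + 1)
            for c in range(j - radius, j + radius + 1)]


def nlm_weight(img, ij, pq, radius, h):
    """Similarity weight between the patches centered at ij and pq."""
    a = patch(img, ij[0], ij[1], radius)
    b = patch(img, pq[0], pq[1], radius)
    # Uniform kernel G (assumption): plain mean of squared differences.
    dist2 = sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)
    return math.exp(-dist2 / (h * h))


def nlm_pixel(img, ij, candidates, radius, h):
    """Weighted average of the candidate pixels, normalized by the weights."""
    weights = [nlm_weight(img, ij, pq, radius, h) for pq in candidates]
    total = sum(weights)
    return sum(w * img[p][q] for w, (p, q) in zip(weights, candidates)) / total
```

A larger `h` flattens the weights toward a plain average, while a small `h` lets only near-identical patches contribute. All candidate centers must lie at least `radius` pixels from the image border.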

2.2 Sparse Representation 

In signal processing, sparse representation refers to a
sparse linear combination of a small number of known or
unknown "basis vectors". Sparse representation
comprises two things: (1) dictionary training and (2)
sparse coding. Dictionary training is an approach to
learn a basis matrix $D \in \mathbb{R}^{p \times k}$ (every column is a p-dimensional
basis vector). The main goal of the training methods
leads to a simple formulation with $\ell_0$ and $\ell_1$ sparsity measures.

Sparse coding refers to the problem of finding a sparse
representation with a small number of significant
coefficients. For sparse coding, consider a signal x of
dimension p that can be sparsely represented with respect to a
given dictionary D. This problem can be formulated as
finding the sparsest coefficient vector $\alpha$:

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad x = D\alpha \tag{3}$$
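Finding the sparsest coefficient vector exactly is combinatorial, so greedy approximations are common in practice. The sketch below uses plain matching pursuit, which is a stand-in chosen for brevity and not the solver used in this paper; it assumes the dictionary atoms (columns of D) are given as unit-norm lists, and all names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(x, atoms, max_nonzero, tol=1e-10, max_iter=100):
    """Greedy sparse code of x over unit-norm atoms.

    Returns (alpha, residual) with at most max_nonzero non-zero entries
    in alpha, so that x is approximately sum(alpha[k] * atoms[k]).
    """
    residual = list(x)
    alpha = [0.0] * len(atoms)
    support = set()
    for _ in range(max_iter):
        corr = [dot(residual, d) for d in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[k]) <= tol:
            break  # residual is (numerically) orthogonal to every atom
        if k not in support and len(support) >= max_nonzero:
            break  # sparsity budget exhausted
        support.add(k)
        alpha[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return alpha, residual
```

For an orthonormal dictionary this recovers the exact coefficients in as many iterations as there are non-zero entries; for an over-complete dictionary the result is only an approximation to the sparsest solution.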

The single dictionary training process will be described 
in Section 3. In this setting, we use generic images for 
training purpose. 



3. Single Dictionary Learning and Proposed 
Algorithm 

This section discusses the training of the single dictionary.
The dictionary is prepared from a set of high-resolution
images only.

In this technique, the high-resolution image X is
reconstructed from the low-resolution image Y. In the
proposed algorithm, three constraints are assumed in order to
reconstruct the high-resolution image X:

(1) Fidelity constraint: The reconstructed high-resolution
image X should be consistent with
the input low-resolution image Y according to the
image degradation model,

$$Y = SBX + \eta \tag{4}$$

where Y is the observed low-resolution image,
which is a down-sampled and blurred version of X, S
denotes the down-sampling operator, B denotes the
unknown blurring operator and $\eta$ denotes the noise term.

(2) Sparsity constraint: Following the assumption
of sparsity described in the last section, the
reconstructed high-resolution image X can be
sparsely represented with respect to an unknown
dictionary $D \in \mathbb{R}^{p \times k}$:

$$D, \{\alpha_{ij}\} = \arg\min_{D, \{\alpha_{ij}\}} \|\alpha_{ij}\|_0 \quad \text{s.t.} \quad \|R_{ij}X - D\alpha_{ij}\|_2^2 \le \varepsilon_{ij} \tag{5}$$

where $\alpha_{ij}$ is the representation coefficient for
patch $R_{ij}X$ and $\varepsilon_{ij}$ denotes the error term.

(3) Non-local means constraint: This term expresses
that one pixel in X can be estimated by a
weighted average of similar pixels found in X.
The non-local means regularization term for an
individual pixel $x_{ij}$ can be formulated as

$$x_{ij} = \arg\min_{x_{ij}} \left\| x_{ij} - w_{ij}^{T} \chi_{ij} \right\|_2^2 \tag{6}$$

where $\chi_{ij}$ denotes the vector consisting of similar pixels
found in X and $w_{ij}$ is the corresponding weight vector.



A unified energy-function framework is produced by
combining these three constraints, which can
be written as:

$$D, X, \{\alpha_{ij}\} = \arg\min_{D, X, \{\alpha_{ij}\}} \|SBX - Y\|_2^2 + \sum_{i=1}^{M} \sum_{j=1}^{N} \lambda \|\alpha_{ij}\|_0 + \sum_{i=1}^{M} \sum_{j=1}^{N} \beta_{ij} \left\| x_{ij} - w_{ij}^{T} \chi_{ij} \right\|_2^2 \tag{7}$$

where $\lambda$ and $\beta_{ij}$ are constants for all patches
which act as regularization parameters.

After the dictionary has been trained, equation (7)
reduces to:

$$X, \{\alpha_{ij}\} = \arg\min_{X, \{\alpha_{ij}\}} \|SBX - Y\|_2^2 + \sum_{i=1}^{M} \sum_{j=1}^{N} \lambda \|\alpha_{ij}\|_0 + \sum_{i=1}^{M} \sum_{j=1}^{N} \beta_{ij} \left\| x_{ij} - w_{ij}^{T} \chi_{ij} \right\|_2^2 \tag{9}$$

Figure 3 shows the block diagram of the proposed work.



3.1 Single Dictionary Training 

Figure 1 shows the collection of images used for training
the dictionary.

Figure 1. Set of training images

Figure 2 shows the flow of single dictionary training
using multiple generic images.

Figure 2. Single dictionary training using multiple generic images (HR image set to HR dictionary)

Usually, a dictionary is learned from a collection of
training images $X = \{x_1, x_2, \ldots, x_t\}$. Sometimes it is
very difficult to learn a compact dictionary.

Joint dictionary training is extremely time consuming. To
solve this problem, we use single dictionary
training, in which the dictionary is trained off-line
from high-resolution patches randomly sampled from the
training images.

The procedure for dictionary training is formulated as follows:

$$D, \{\alpha_k\} = \arg\min_{D, \{\alpha_k\}} \sum_{k} \|p_k - D\alpha_k\|_2^2 \quad \text{s.t.} \quad \|\alpha_k\|_0 \le L \;\; \forall k \tag{8}$$

where k indexes the sampled patches $p_k$,
$\alpha_k$ denotes the sparse representation coefficient for the kth
patch and L represents the maximal number of non-zero
entries.



3.2 ALGORITHM 



1: Given the input low-resolution image Y, load the
high-resolution dictionary D.

2: Using bicubic interpolation, find $X_0$,
which serves as the initial estimate of image X.

3: For image X, we modify equation (9) as:

$$\{\alpha_{ij}\} = \arg\min_{\{\alpha_{ij}\}} \sum_{i=1}^{M} \sum_{j=1}^{N} \lambda \|\alpha_{ij}\|_0 \quad \text{s.t.} \quad \|R_{ij}X - D\alpha_{ij}\|_2^2 \le \varepsilon_{ij} \tag{10}$$

This process is called the sparse coding process.

To solve it, we use dynamic group sparsity, which gives an
improvement over conventional sparse coding.

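4: With the updated coefficients, equation (9) can be
modified as:

$$X = \arg\min_{X} \|SBX - Y\|_2^2 + \sum_{i=1}^{M} \sum_{j=1}^{N} \beta_{ij} \left\| x_{ij} - w_{ij}^{T} \chi_{ij} \right\|_2^2 \quad \text{s.t.} \quad \|R_{ij}X - D\alpha_{ij}\|_2^2 \le \varepsilon_{ij} \tag{11}$$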

On the right-hand side, the second term can be written in
matrix form,

$$\beta \left\| (I - W) X \right\|_2^2 \tag{12}$$

where

$$W(r,s) = \begin{cases} w_{ij}^{pq} & \text{if } (p,q) \text{ is an element of } P_{ij} \\ 0 & \text{otherwise} \end{cases} \tag{13}$$

Here, r, s are the coordinates of (p, q) and (i, j). The
third term is converted into a global form, which we call the
similarity term:

$$\|X - Z\|_2^2 \le \varepsilon \tag{14}$$

where Z is the reconstructed image derived only from the
representation coefficients $\alpha_{ij}$ as:

$$Z = \arg\min_{Z} \sum_{i,j} \left\| R_{ij}Z - D\alpha_{ij} \right\|_2^2 \tag{15}$$

This problem has a least-squares solution, which is
given by

$$Z = \left( \sum_{i,j} R_{ij}^{T} R_{ij} \right)^{-1} \sum_{i,j} R_{ij}^{T} D\alpha_{ij} \tag{16}$$

After calculating W and Z, we again modify equation (11)
as:

$$X = \arg\min_{X} \|SBX - Y\|_2^2 + \tau \left\| (I - W) X \right\|_2^2 + \zeta \|X - Z\|_2^2 \tag{17}$$

where $\tau$ and $\zeta$ are regularization factors that control the
influence of the three parts. But equation (17) is not
convex in Z. Using a gradient descent approach, we
can minimize this optimization problem to
generate the high-resolution image X.

5: Go to Step 3 until convergence.
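The gradient descent update of Step 4 can be illustrated on a tiny vectorized problem. The sketch below minimizes a data term plus a weighted non-local term plus a weighted similarity term, with the image flattened to a list, the combined down-sampling and blurring operator given as an explicit matrix `a`, and W and Z held fixed. The fixed step size, the iteration count and all names are assumptions for illustration, not values from the paper.

```python
def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


def minimise_energy(a, y, w, z, x0, tau, zeta, step=0.05, iters=500):
    """Gradient descent on
    ||a x - y||^2 + tau * ||(I - w) x||^2 + zeta * ||x - z||^2.
    """
    n = len(x0)
    i_minus_w = [[(1.0 if r == c else 0.0) - w[r][c] for c in range(n)]
                 for r in range(n)]
    a_t, iw_t = transpose(a), transpose(i_minus_w)
    x = list(x0)
    for _ in range(iters):
        # Gradient of each term (up to the common factor 2).
        g_data = matvec(a_t, [u - v for u, v in zip(matvec(a, x), y)])
        g_nlm = matvec(iw_t, matvec(i_minus_w, x))
        grad = [2.0 * (gd + tau * gn + zeta * (xi - zi))
                for gd, gn, xi, zi in zip(g_data, g_nlm, x, z)]
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x
```

In the full algorithm, `x0` would be the bicubic estimate, and W and Z would be recomputed from the current estimate before each call, so this function covers only one pass of Step 4.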



Figure 3. Flow of proposed work

Figure 4. Zoom image, Bicubic Interpolation, Sparse Recovery for
Lena image.

Table 1 and Figure 4 indicate the PSNR performance of the
different methods at 2X, 4X and 6X magnification for
the Lena image. Table 1 shows that, compared
to the other methods, sparse recovery
gives superior results.



Table 2. PSNR Performance for Different Methods (Face)

PSNR (dB), Face, 5x5 patch

| Upscale | Zoom Image | Bicubic Interpolation | Sparse Recovery |
|---|---|---|---|
| 2 | 32.22 | 32.96 | 34.12 |
| 4 | 32.16 | 32.72 | 34.51 |
| 6 | 32.03 | 33.74 | 34.17 |



4. Experimental Results And Analysis 

In this paper, 2X magnification is carried out for the input
low-resolution images. From these
experiments, it is concluded that 5x5 is the best patch size.
Therefore, we demonstrate the experiments with the 5x5 patch size.
For color images, human eyes are more sensitive to the
luminance channel. Hence, we apply the proposed
algorithm to the luminance channel only.

Here, we train a single dictionary on 100,000
high-resolution image patches randomly sampled
from 69 high-resolution images in the training
set. To estimate the performance and computation, we fix
the dictionary size at 1024 and the sparse regularization factor
$\lambda = 0.15$ throughout the experiment.

To evaluate the performance of the different methods, we use
performance parameters such as peak signal to noise ratio
(PSNR), structural similarity (SSIM) and image quality
index (IQI).
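Of the three quality measures, PSNR is the simplest to state in code. The sketch below uses the standard definition from the mean squared error and assumes 8-bit images with a peak value of 255; the function name and the list-of-rows image format are illustrative choices.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal to noise ratio in dB between two equally sized images."""
    sq_diffs = [(r - e) ** 2
                for row_r, row_e in zip(reference, estimate)
                for r, e in zip(row_r, row_e)]
    mse = sum(sq_diffs) / len(sq_diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

A higher PSNR means the reconstructed image is closer to the reference, but PSNR does not always track perceived quality, which is one reason SSIM and IQI are reported alongside it.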

We use Intel(R) Core(TM) i5-3470 CPU@3.20GHz 
machine with 4GB RAM for simulation. 







Table 1. PSNR Performance for Different Methods (Lena)

PSNR (dB), Lena, 5x5 patch

| Upscale | Zoom Image | Bicubic Interpolation | Sparse Recovery |
|---|---|---|---|
| 2 | 30.98 | 32.79 | 34.29 |
| 4 | 30.49 | 32.92 | 33.51 |
| 6 | 30.06 | 32.35 | 32.60 |



Figure 5. Results of the Face with 2X magnification. 

(a) low-resolution (LR) input image, 

(b) Zoom Image (PSNR = 32.22dB), 

(c) Bicubic Interpolation (PSNR = 32.96dB), 

(d) Sparse Recovery for 5x5 patch (PSNR = 34.12dB). 
Figure 6. Results of the Face with 4X magnification.

(a) low-resolution (LR) input image,

(b) Zoom Image (PSNR = 32.16dB),

(c) Bicubic Interpolation (PSNR = 32.72dB),

(d) Sparse Recovery for 5x5 patch (PSNR = 34.51dB).







Figure 7. Results of the Face with 6X magnification. 

(a) low-resolution (LR) input image, 

(b) Zoom Image (PSNR = 32.03dB), 

(c) Bicubic Interpolation (PSNR = 33.74dB), 

(d) Sparse Recovery for 5x5 patch (PSNR = 34.17dB). 



Table 2 and Figures 5, 6 and 7 show further
results for the Face image with the different methods at 2X, 4X
and 6X magnification. Figures 5, 6 and 7 make
clear that sparse recovery
gives the best quality results.

From the previous analysis, it is clear that sparse
recovery is the best among the methods compared. In this algorithm
we also consider 5x5, 7x7 and 9x9 patches. Figures 8, 9
and 10 show the sparse recovery results along with the
zoom image and bicubic interpolation methods. It is clear
from Figures 8, 9 and 10 that sparse recovery with 9x9
patches is the best, but its run-time is very high.







Figure 8. Results of the Lena with 2X magnification. 

(a) low-resolution (LR) input image, 

(b) Zoom Image (PSNR = 30.98dB), 

(c) Bicubic Interpolation (PSNR = 32.79dB), 

(d) Sparse Recovery for 5x5 patch (PSNR = 34.29dB), 

(e) Sparse Recovery for 7x7 patch (PSNR = 34.38dB), 

(f) Sparse Recovery for 9x9 patch (PSNR = 34.39dB). 

Tables 3, 4 and 5 show further results for the
quality parameters PSNR,
SSIM and IQI, and for the run-time, with 5x5, 7x7 and 9x9 patch sizes
on different test images. Tables 3, 4 and 5 make
clear that as the patch size increases, the PSNR value
increases slightly but the run time also increases.
Taking run time as the main controlling factor
together with acceptable image quality, the 5x5 patch results are
preferable to the others.
Figure 9. Results of the Butterfly with 2X magnification.

(a) low-resolution (LR) input image,

(b) Zoom Image (PSNR = 28.30dB),

(c) Bicubic Interpolation (PSNR = 33.15dB),

(d) Sparse Recovery for 5x5 patch (PSNR = 36.59dB), 

(e) Sparse Recovery for 7x7 patch (PSNR = 36.60dB), 

(f) Sparse Recovery for 9x9 patch (PSNR = 36.45dB). 







Figure 10. Results of the Parrot with 2X magnification. 

(a) low-resolution (LR) input image, 

(b) Zoom Image (PSNR = 32.94dB), 

(c) Bicubic Interpolation (PSNR = 34.92dB), 

(d) Sparse Recovery for 5x5 patch (PSNR = 35.87dB), 

(e) Sparse Recovery for 7x7 patch (PSNR = 35.98dB), 

(f) Sparse Recovery for 9x9 patch (PSNR = 36.00dB). 



Figure 11. PSNR Performance for patch size 5x5, 7x7 and 9x9 for
different test images.




Figure 12. Time Estimation for patch size 5x5, 7x7 and 9x9 for 
different test images. 



Figures 11 and 12 show the graphical representation of
PSNR and run-time. Hence, we consider the optimum patch
size to be 5x5, which provides good super-resolution results
at an acceptable run-time.

5. Conclusion 

In this paper, a simple unified framework for single
image super-resolution using a single dictionary is
developed. In single dictionary training, the dictionary is
trained on high-resolution image patches instead of
high/low-resolution image patch pairs. One advantage of
single dictionary training is that there is no need to train a new
dictionary when the up-scaling factor is changed. Another
advantage is that it reduces the run-time. This dictionary is
also called the high-resolution dictionary. From Tables
3, 4 and 5, it is clear that the 5x5 patch is the best choice, with
acceptable image quality. Using this algorithm, the
super-resolution results have rich quality and precise
edges. The results show that the single dictionary
performs this task more accurately.
Table 3. Performances of Different Methods for Different Test Images for Patch Size 5x5

| Test images | Zoom PSNR | Zoom SSIM | Zoom IQI | Bicubic PSNR | Bicubic SSIM | Bicubic IQI | Sparse recovery PSNR | Sparse recovery SSIM | Sparse recovery IQI | Time (sec) |
|---|---|---|---|---|---|---|---|---|---|---|
| Lena | 30.98 | 0.89 | 0.79 | 32.79 | 0.91 | 0.82 | 34.29 | 0.93 | 0.85 | 12.876736 |
| Butterfly | 28.3 | 0.92 | 0.93 | 33.15 | 0.97 | 0.97 | 36.59 | 0.98 | 0.98 | 10.485493 |
| Face | 32.22 | 0.84 | 0.75 | 32.96 | 0.85 | 0.76 | 34.12 | 0.88 | 0.80 | 5.916809 |
| Parrot | 32.94 | 0.92 | 0.77 | 34.92 | 0.94 | 0.8 | 35.87 | 0.95 | 0.81 | 23.553663 |
| Rosella-bird | 28.55 | 0.9 | 0.86 | 29.58 | 0.92 | 0.89 | 30.16 | 0.93 | 0.89 | 15.802999 |
| Spaghetti | 26.87 | 0.98 | 0.81 | 28.95 | 0.98 | 0.84 | 29.85 | 0.98 | 0.87 | 31.365603 |



Table 4. Performance of Different Methods for Different Test Images for Patch Size 7x7 

Test image      Zoom method           Bicubic interpolation   Sparse recovery       Time (sec) 
                PSNR    SSIM   IQI    PSNR    SSIM   IQI      PSNR    SSIM   IQI 
Lena            30.98   0.89   0.79   32.79   0.91   0.82     34.38   0.93   0.85   17.639275 
Butterfly       28.3    0.92   0.93   33.15   0.97   0.97     36.6    0.98   0.98   14.358423 
Face            32.22   0.84   0.75   32.96   0.85   0.76     34.12   0.88   0.8    7.966472 
Parrot          32.94   0.92   0.77   34.92   0.94   0.8      35.98   0.95   0.81   32.601400 
Rosella-bird    28.55   0.9    0.86   29.58   0.92   0.89     30.11   0.93   0.89   22.012821 
Spaghetti       26.87   0.98   0.81   28.95   0.98   0.84     29.89   0.98   0.87   43.989943 



Table 5. Performance of Different Methods for Different Test Images for Patch Size 9x9 

Test image      Zoom method           Bicubic interpolation   Sparse recovery       Time (sec) 
                PSNR    SSIM   IQI    PSNR    SSIM   IQI      PSNR    SSIM   IQI 
Lena            30.98   0.89   0.79   32.79   0.91   0.82     34.39   0.93   0.85   44.819674 
Butterfly       28.3    0.92   0.93   33.15   0.97   0.97     36.45   0.98   0.98   35.484671 
Face            32.22   0.84   0.75   32.96   0.85   0.76     34.08   0.88   0.80   19.118581 
Parrot          32.94   0.92   0.77   34.92   0.94   0.8      36      0.95   0.81   82.564971 
Rosella-bird    28.55   0.9    0.86   29.58   0.92   0.89     30.17   0.93   0.89   55.361026 
Spaghetti       26.87   0.98   0.81   28.95   0.98   0.84     29.88   0.98   0.87   110.617177 
Abstract 

The limited availability of labeled training samples for 
supervised hyperspectral image classification decreases 
the accuracy of supervised classifiers. To overcome this 
difficulty, this paper introduces a new supervised 
classification approach for remotely sensed hyperspectral 
image data which performs accurately with a limited number 
of training samples and integrates spectral and spatial 
information via a composite kernel. The developed 
method introduces, in the learning step, new examples 
oversampled from the available limited training set by 
using interpolation techniques. First, each pixel is 
represented by two vectors: a spectral vector containing 
all the spectral information and a spatial vector including 
contextual information extracted using Extended 
Multi-attribute Profiles (EMAP). Then, an interpolation 
technique is used to generate new training samples from 
the limited available data. Finally, a support vector 
machine (SVM) with a composite kernel is 
trained to generate the classification map. The proposed 
classification approach is evaluated using both simulated 
and real hyperspectral data sets, and achieves higher 
performance than classification 
without oversampling. The integration of interpolation 
methods with SVMs, combined with the use of spectral 
and spatial information, represents a contribution to the 
literature. This approach is shown to provide accurate 
classification of hyperspectral imagery with a limited 
number of training samples. 



Keywords: Hyperspectral images, SVM, composite 
kernels, interpolation techniques. 



Nomenclature: 

AA          Average Accuracy 
AP          Attribute Profile 
CFS         Correlation-based Feature Selection 
EAP         Extended Attribute Profile 
EMAP        Extended Multi-Attribute Profile 
ICA         Independent Components Analysis 
k           Kappa coefficient 
Lp          Linear Projection 
mRMR        Minimum-Redundancy-Maximum-Relevance 
OA          Overall Accuracy 
PCA         Principal Components Analysis 
SVM         Support Vector Machine 
SVM-RFE     Recursive Feature Elimination 

$C = \{1, \ldots, C\}$ 
            Set of C classes. 
$P = \{1, \ldots, n\}$ 
            Set of integers indexing the n pixels of an image. 
$Y = (y_1, \ldots, y_n)$ 
            Image with n pixels. 
$Y^{spect} = (y_1^{spect}, \ldots, y_n^{spect}) \in R^{d \times n}$ 
            Spectral features: each pixel is characterized by a 
            d-dimensional spectral vector. 
$Y^{spat} = (y_1^{spat}, \ldots, y_n^{spat}) \in R^{m \times n}$ 
            Spatial features: each pixel is characterized by an 
            m-dimensional spatial vector. 
$Lab = (lab_1, \ldots, lab_n)$ 
            Labels of the n pixels. 
$T_i^{spect} = \{(i, y_1^{spect}), \ldots, (i, y_t^{spect})\}$ 
            Set of t spectral vectors of available labeled samples 
            in class i. 
$T_i^{spat}$ 
            Set of t spatial vectors of available labeled samples 
            in class i. 
$x^{spect} = (x_1^{spect}, \ldots, x_t^{spect}) \in R^{d \times t}$ 
            Set of reals representing the abscissas of the t 
            spectral training samples in class i. 
$x^{spat} = (x_1^{spat}, \ldots, x_t^{spat}) \in R^{m \times t}$ 
            Set of reals representing the abscissas of the t 
            spatial training samples in class i. 
$Y_{i,new}^{spect} = (y_1^{spect}, \ldots, y_g^{spect}) \in R^{d \times g}$ 
            g generated spectral vectors for class i, used to 
            train the spectral kernel. 
$Y_{i,new}^{spat} = (y_1^{spat}, \ldots, y_g^{spat}) \in R^{m \times g}$ 
            g generated spatial vectors in class i, required for 
            the learning of the spatial kernel. 
$K^{spect}$   Spectral kernel. 
$K^{spat}$    Spatial kernel. 
f           Interpolation function. 
μ           Weight, 0 < μ < 1. 



1. Introduction 

Supervised classification of high-dimensional remotely 
sensed hyperspectral images, which requires prior 
knowledge, is a challenging task [1]. The Hughes phenomenon 
appears as the data dimensionality increases. This 
problem refers to the imbalance between the high number 
of available spectral bands and the limited availability of 
labeled training samples used in the learning step of the 
classifier. To overcome this problem, two 
strategies have been followed: reducing the 
dimensionality of the images or increasing the number of 
training samples. For the first strategy, several feature 
selection [2] and extraction methods have been combined 
with machine learning techniques including SVMs. In [3], 
a significant increase in classification accuracy was 
obtained by combining SVMs with 
four feature selection methods: SVM-RFE, 
CFS, mRMR and Random Forest. ICA was 
investigated in [4] to reduce the dimensionality of the 
hyperspectral image. In [5], the unsupervised feature 
selection method Lp was combined with SVM to 
improve classification performance. For the 
second strategy, various semi-supervised approaches have 
been presented in the literature. Indeed, increasing 
the number of training examples is a challenge 
because the collection of labeled data is generally 
difficult, expensive and time-consuming. For that reason, 
semi-supervised learning techniques [6] have been adopted in 
hyperspectral image classification. These approaches 
exploit both training data and 
unlabeled patterns. In [7], active queries were used 
to develop a semiautomatic procedure that 
generates land cover maps from remote sensing images. A 
significant improvement was achieved by 
exploiting active learning algorithms in the 
semi-supervised self-learning approach 
presented in [8]. 

In this work, we propose another way to obtain 
accurate classification with a small training set: 
increasing the number of labeled samples without using 
semi-supervised learning techniques. The main idea of 
the proposed method is to oversample the available 
learning samples to generate new training data. In this 
context, interpolation techniques can provide competitive 
advantages by creating new samples on a curve that 
interpolates the existing labeled examples. This idea has 
not yet been exploited in the literature to improve the 
accuracy of supervised hyperspectral image classification. 
Combining spectral and spatial information is a recent 
trend that has given successful results in several recent 
remote sensing hyperspectral image classification 
studies. Hence, many spectral-spatial classification 
approaches have been presented, such as the composite 
kernels presented in [9], the generalized composite kernel 
framework proposed in [10], the SVM ensemble 
approach combining spectral, structural, and semantic 
features developed in [11], the introduction of spectral and 
different spatial features via the composition of kernels 
in [12] and the investigation of segmentation methods in 
[13]. In particular, composite kernels have shown high 
performance in terms of accuracy and computational time. 
In this paper, we propose a new supervised hyperspectral 
image classification approach which overcomes the 
problem of the limited number of training samples by 
generating new training samples and combines spectral 
and spatial information via composite kernels. The 
algorithm implements the following three main steps: 1) 
spectral and spatial characterization, where each pixel is 
represented by spectral and spatial vectors, 2) 
oversampling, which generates new training samples by 
means of interpolation methods, and 3) classification, which 
produces the classification map built by SVMs with 
composite kernels. The main novelty of our proposed 
work is the integration of interpolation techniques to 
generate, from a small training set, a large number 
of new learning examples, which will be shown to 
increase the size of the labeled set and to provide 
good classification results. 

The remainder of this paper is organized as follows. 
Section 2 formulates the problem. Section 3 describes the 
proposed approach. Section 4 reports classification 
results based on simulated and real hyperspectral data 
sets. Finally, Section 5 concludes with some remarks. 

2. Problem formulation 

Supervised classification aims at assigning a label 
$lab_i \in C$ to each pixel $y_i$ by using a set of labeled 
samples, called training samples, to train the classifier. 
This process results in an image of class labels Lab. 
SVM classifiers are widely applied in 
the remote sensing field due to their ability to 
handle small training data sets successfully [14, 15]. 
However, for hyperspectral images, several 
studies [1] have shown that the intra-class variability 
(various spectral signatures in the same class) and the 
high dimensionality of the data require a large 
number of training samples for accurate 
classification. In fact, SVM aims to find a hyperplane 
that separates the data set into a predefined number of 
classes based on training examples [16]. The optimal 
separating hyperplane is the decision boundary 
that minimizes misclassifications, obtained in the training 
step. If the number of learning examples is limited, 
the classifier cannot find the optimal separating 
hyperplane and thus produces a high rate of 
misclassifications. 

On the other hand, the collection of learning samples 
involves expensive ground campaigns, which leads to 
a small set of labeled examples. To overcome this 
limitation and increase the size of the training set, 
interpolation techniques can be exploited to generate 
synthetic data from the limited number of available 
samples. Indeed, new samples can be created on 
curves that interpolate the t points representing the set of 
training samples. 

The interpolation process consists of constructing a 
function f(x) that passes through k points: 
$(x_1, f(x_1)), \ldots, (x_k, f(x_k))$. 
Various interpolation methods have been proposed in the 
literature, such as: 

• Linear interpolation: For each interval $(x_i, x_{i+1})$, the 
interpolation formula is given by Equation (1): 

$f(x) = A f(x_i) + B f(x_{i+1})$    (1) 

where 

$A = \frac{x_{i+1} - x}{x_{i+1} - x_i}, \quad B = 1 - A = \frac{x - x_i}{x_{i+1} - x_i}$    (2) 

• Lagrange interpolation: In this case, the interpolation 
polynomial is given as a linear combination of 
Lagrange basis polynomials $l_j(x)$ (Equation (3)): 

$f(x) = \sum_{j=1}^{k} f(x_j) \, l_j(x)$    (3) 

with 

$l_j(x) = \frac{\prod_{i=1, i \neq j}^{k} (x - x_i)}{\prod_{i=1, i \neq j}^{k} (x_j - x_i)}$    (4) 

• Cubic spline interpolation: A cubic spline is a spline 
for which the function is a polynomial of degree 3 on 
every subinterval (Equation (5)): 

$f(x) = \begin{cases} f_0(x), & x \in [x_0, x_1) \\ f_1(x), & x \in [x_1, x_2) \\ \vdots \\ f_{k-1}(x), & x \in [x_{k-1}, x_k) \end{cases}$    (5) 

where, $\forall i$, the degree of $f_i(x)$ is equal to 3. 

Following the interpolation techniques described earlier, 
we have developed a new supervised classification 
method with two steps. The first step generates a new set 
of spectral and spatial samples from a limited number of 
available labeled data, with the goal of increasing the size 
of the learning set needed to train the two different 
kernels $K^{spect}$ and $K^{spat}$. The second step 
generates the final classification map using SVMs with 
composite kernels. In our proposed algorithm, the spectral 
information is represented by $Y^{spect}$, which is 
processed by a spectral kernel. The spatial information is 
represented by $Y^{spat}$, which is processed by a spatial 
kernel. In each class i, the generated spectral information 
$Y_{i,new}^{spect}$ results from the interpolation of 
$\{(x_1^{spect}, y_1^{spect}), \ldots, (x_t^{spect}, y_t^{spect})\}$ 
and the set of new spatial samples $Y_{i,new}^{spat}$ is 
generated from the interpolation of 
$\{(x_1^{spat}, y_1^{spat}), \ldots, (x_t^{spat}, y_t^{spat})\}$. 

3. Proposed approach 

In this section, we first present the spectral and spatial 
characterization techniques used to extract features. 
Second, we propose a new algorithm which uses 
interpolation methods to generate new learning samples. 
Finally, we integrate the spectral and the spatial 
information via SVM and composite kernels. 



3.1. Spectral and spatial characterization 

The rich spectral and spatial information available in 
hyperspectral images makes it possible to 
distinguish between spectrally similar materials. Various 
methods have been widely used in the literature for the 
spectral and spatial characterization of hyperspectral pixels. 
For the spectral characterization, authors usually use all 
the spectral information or dimensionality reduction 
techniques like PCA and ICA to extract the most 
informative data. For spatial feature extraction, 
different means have been adopted, such as features 
derived from the neighborhood of the pixel, 
morphological filters and attribute filters. In this paper, 
we use all the spectral information for 
the spectral characterization and EMAP, based on 
attribute filters, for the spatial characterization. EMAP 
has been selected for its capability to distinguish 
between similar objects and to model different spatial 
structures in the scene. 

EMAP [17] is a profile that stacks the EAPs obtained 
using different types of attributes. An EAP is obtained by 
generating an AP (produced by applying a sequence of 
attribute filters with various thresholds) on each of the 
first p principal components. 

3.2. Generation of synthetic data 

One of the main problems in hyperspectral image 
classification is the Hughes phenomenon, 
caused by the imbalance between the high number of 
available spectral bands and the limited availability of 
labeled training samples. To overcome this difficulty, we 
investigate in this paper interpolation 
techniques for generating new training samples from the 
limited number of available learning examples. 
New samples can be created on curves that interpolate 
the t points representing the set of t training samples 
(Figure 1). 

We assume that each training sample i is represented by an 
f-dimensional vector $y_i = (y_i^1, \ldots, y_i^f)$ combining the f 
features. Hence, generated samples must be 
f-dimensional vectors $y_{new} = (y_{new}^1, \ldots, y_{new}^f)$ that 
combine the new values of the f features resulting from 
the interpolation process. For that, each feature j of 
training sample i must be represented by a point $(x_i^j, y_i^j)$. 
Evaluating the interpolation function that passes through 
the training sample points at g abscissas $x_{new}^j$ generates 
the new samples $y_{new}^j$. Based on the aforementioned 
assumptions, we have the following generative function 
(Equation (6)): 

$y_{new}^j = f(x_{new}^j)$    (6) 

In this paper, we focus on three interpolation techniques: 
linear interpolation, cubic spline interpolation and 
Lagrange interpolation. Algorithm 1 shows the 
pseudocode for learning data generation based on 
interpolation techniques. 
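
The generation step of Equation (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abscissas of the t training samples are assumed to be their indices 0, ..., t-1, the g new abscissas are assumed to be evenly spaced inside that range, and only the linear (Equations (1)-(2)) and Lagrange (Equations (3)-(4)) interpolants are shown; the cubic spline is omitted for brevity. The function names are hypothetical.

```python
def linear_interp(xs, ys, x):
    """Piecewise-linear interpolation, Equations (1)-(2); xs strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            a = (xs[i + 1] - x) / (xs[i + 1] - xs[i])
            return a * ys[i] + (1.0 - a) * ys[i + 1]
    raise ValueError("x lies outside the interpolation range")


def lagrange_interp(xs, ys, x):
    """Lagrange interpolation polynomial, Equations (3)-(4)."""
    total = 0.0
    for j in range(len(xs)):
        basis = 1.0
        for i in range(len(xs)):
            if i != j:
                basis *= (x - xs[i]) / (xs[j] - xs[i])
        total += ys[j] * basis
    return total


def oversample(train, g, interp=linear_interp):
    """Generate g new samples from t training samples of one class.

    train: list of t samples, each a list of feature values.
    Assumption: sample i sits at abscissa i; new abscissas are evenly
    spaced strictly between the first and last training abscissa.
    """
    t = len(train)
    n_features = len(train[0])
    xs = [float(i) for i in range(t)]
    x_new = [(k + 0.5) * (t - 1) / g for k in range(g)]
    new_samples = [[0.0] * n_features for _ in range(g)]
    for j in range(n_features):          # interpolate each feature separately
        ys = [sample[j] for sample in train]
        for k, x in enumerate(x_new):    # Equation (6): y_new = f(x_new)
            new_samples[k][j] = interp(xs, ys, x)
    return new_samples
```

For example, `oversample([[0.0, 1.0], [2.0, 3.0], [4.0, 1.0]], 4)` returns four two-feature samples lying on the piecewise-linear curve through the three training points; passing `interp=lagrange_interp` switches to the polynomial interpolant.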




[Figure 1 panels: cubic spline interpolation; Lagrange interpolation] 

Figure 1. Representation of generated data under interpolation curves 
('*' illustrates new samples, '°' illustrates the available data). 

Algorithm 1: Oversampling 

Input: $Y_{train} = (y_1, \ldots, y_t)$: training samples set. 
       f: dimension of vectors. 
Output: $Y_{new} = (y_{new,1}, \ldots, y_{new,g})$: generated training 
       samples. 

for i = 1 to f do (for each feature i) 
    $y = Y_{train}(i,:)$ and $x_{train} = (x_1, \ldots, x_t)$ 
        (coordinates of the points to be interpolated) 
    $y = f(x_{train})$ (compute the interpolation function according 
        to (1), (3) or (5)) 
    $x_{new} = (x_{new,1}, \ldots, x_{new,g})$ (new samples abscissas) 
    $N = f(x_{new})$ (compute new training samples by evaluating 
        the function f at the new abscissas) 
    $Y_{new} = [Y_{new}; N]$ 
end 

3.3. Spectral spatial classification via SVM with 
composite kernels 

Owing to their high performance on 
high-dimensional data, SVMs have 
been widely adopted in the classification of hyperspectral 
images. 

SVM [18] is a kernel-based classifier that 
projects the data into a higher-dimensional space by means 
of a non-linear mapping function Φ and seeks the 
optimal separating hyperplane by margin maximization. 
SVM was first proposed for binary classification and 
was later extended to multi-class 
classification. 

Given a labeled training data set 
$\{(x_1, L_1), \ldots, (x_m, L_m)\}$, where $x_i \in R^N$ and 
$L_i \in \{-1, 1\}$, the SVM classifier computes 
the optimum separating hyperplane defined 
by Equation (7): 

$w \cdot x + b = 0$    (7) 

where (w, b) are the parameters of the hyperplane. 

Thus, the classifier can be defined as Equation (8): 

$h(x, w, b) = \mathrm{sgn}(w \cdot x + b)$    (8) 

The support vectors lie on the two hyperplanes of equation 
$w \cdot x + b = \pm 1$. The maximization of the margin leads to 
the following optimization problem (Equation (9)): 

$\min_{w,b} \left\{ \frac{\|w\|^2}{2} \right\}$ with $L_i (w \cdot x_i + b) \geq 1, \; i = 1, \ldots, m$    (9) 

If the training samples are not linearly separable, the 
optimization problem can be solved by using Lagrange 
multipliers $\lambda_i$ and becomes (Equation (10)): 

$\max_{\lambda} \left\{ \sum_{i=1}^{m} \lambda_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \lambda_i \lambda_j L_i L_j (x_i \cdot x_j) \right\}$ 
with $0 \leq \lambda_i \leq C, \; \forall i = 1, 2, \ldots, m$ and $\sum_{i=1}^{m} \lambda_i L_i = 0$    (10) 


where C is a regularization parameter introduced to reduce 
the weight of misclassified vectors. Thus the classifier 
function can be computed by an optimization process. For 
data that are not linearly separable, SVM can be 
generalized to compute a nonlinear decision surface by 
projecting the data into a 
higher-dimensional space where they are assumed to 
become linearly separable. The projection uses a 
non-linear function Φ and can be simulated using a 
kernel method. Hence, the dot product $(x_i \cdot x_j)$ involved in 
Equation (10) is replaced by 
$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$. 

Then the non-linear classifier can be expressed as 
Equation (11): 

$h(x, w, b) = \mathrm{sgn}\left( \sum_{i=1}^{N_s} \lambda_i L_i K(s_i, x) + b \right)$    (11) 

where $s_i$ are the $N_s$ support vectors with labels $L_i$, and 
K is a kernel that satisfies Mercer's conditions, such as the 
linear, polynomial and Radial Basis Function (RBF) kernels. 

Some properties of Mercer's kernels: let $K_1$, $K_2$ be two 
valid Mercer's kernels and a > 0. Then 
$K(x_i, x_j) = K_1(x_i, x_j) + K_2(x_i, x_j)$ and 
$K(x_i, x_j) = a K_1(x_i, x_j)$ are Mercer's kernels. 

In order to improve the classification performance 
achieved by using the spectral information alone, various 
spectral-spatial classification approaches incorporating 
the spatial information in addition to the spectral 
information have been proposed in the literature. In 
particular, the use of SVM with composite kernels, which 
combine a spectral kernel $K_w$ for the spectral features 
$x^w$ and a spatial kernel $K_s$ for the contextual features 
$x^s$, has shown high performance in terms of accuracy and 
computational time. 
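
The decision rule of Equation (11) can be illustrated with a short sketch. This is not a trained SVM: the support vectors, labels, multipliers and bias below would normally come from solving Equation (10), and here they are hypothetical values supplied by hand. The RBF width `gamma` and the function names are likewise assumptions.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel exp(-gamma * ||u - v||^2); gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def decide(x, support_vectors, labels, multipliers, bias, kernel=rbf_kernel):
    """Evaluate Equation (11): sign of the kernel expansion plus bias.

    A score of exactly zero is mapped to +1 by convention.
    """
    score = sum(
        lam * lab * kernel(s, x)
        for s, lab, lam in zip(support_vectors, labels, multipliers)
    )
    return 1 if score + bias >= 0 else -1


# Hypothetical two-support-vector model, for illustration only.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
multipliers = [1.0, 1.0]
bias = 0.0

near_negative = decide([0.1, 0.1], support_vectors, labels, multipliers, bias)
near_positive = decide([1.9, 1.9], support_vectors, labels, multipliers, bias)
```

A point close to the negative support vector is assigned -1 and a point close to the positive one is assigned +1, because the RBF kernel weights each support vector by its proximity to x.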

Composite kernels [9] take advantage of the direct sum 
of Hilbert spaces, by which two (or more) Hilbert spaces 
$H_k$ can be combined into a larger Hilbert space. This 
allows three different composite kernels to be proposed: 

• Direct summation kernel (Equation (12)): 

$K(x_i, x_j) = K_w(x_i^w, x_j^w) + K_s(x_i^s, x_j^s)$    (12) 

• Weighted summation kernel (Equation (13)): 

$K(x_i, x_j) = \mu K_w(x_i^w, x_j^w) + (1 - \mu) K_s(x_i^s, x_j^s)$    (13) 

with $0 < \mu < 1$. 

• Cross-information kernel (Equation (14)): 

$K(x_i, x_j) = K_w(x_i^w, x_j^w) + K_s(x_i^s, x_j^s) + K_{ws}(x_i^w, x_j^s) + K_{sw}(x_i^s, x_j^w)$    (14) 

In this case the spectral and spatial vectors must have the 
same dimension. 

Note that composite kernels are Mercer's kernels. 
For that reason, they have been used to solve spectral-spatial 
classification with SVM. 

3.4. Supervised classification algorithm 

To summarize the description of our proposed method, 
Algorithm 2 provides pseudocode for our newly 
developed spectral spatial supervised classification 
algorithm based on an SVM classifier with composite 
kernels and oversampling. Figure 2 shows the flowchart 
of this algorithm. 

Figure 3 illustrates a block diagram summarizing the 
most relevant steps of the newly proposed classification 
algorithm. 

Algorithm 2: SVM_oversampling 

Input: $Y = (y_1, \ldots, y_n)$, $Y_{train}$ 
Output: $L = (l_1, \ldots, l_n)$ 

$Y^{spect}$ = Spectral characterization(Y) 
$T^{spect}$ = Spectral characterization($Y_{train}$) 
$Y^{spat}$ = Spatial characterization(Y) 
$T^{spat}$ = Spatial characterization($Y_{train}$) 
$T'^{spat} = T'^{spect} = [\,]$ 

for i = 1 to c do 
    $Y_{i,new}^{spect}$ = oversampling($T_i^{spect}$, d) 
    $Y_{i,new}^{spat}$ = oversampling($T_i^{spat}$, m) 
    $T_i'^{spect} = [T_i^{spect}, Y_{i,new}^{spect}]$ 
    $T_i'^{spat} = [T_i^{spat}, Y_{i,new}^{spat}]$ 
    $T'^{spect} = [T'^{spect}, T_i'^{spect}]$ 
    $T'^{spat} = [T'^{spat}, T_i'^{spat}]$ 
end 

L = Classification($Y^{spect}$, $Y^{spat}$, $T'^{spat}$, $T'^{spect}$) 
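
The direct and weighted summation kernels of Equations (12) and (13) can be sketched as follows. The sketch pairs an RBF kernel for the spectral vector with a polynomial kernel for the spatial vector, as the experiments in this paper do; the hyperparameters (`gamma`, `degree`, `coef0`), the tuple representation of a pixel and all function names are illustrative assumptions rather than the authors' code.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """Spectral kernel: RBF with an assumed width gamma."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def poly_kernel(u, v, degree=2, coef0=1.0):
    """Spatial kernel: polynomial with assumed degree and offset."""
    return (sum(a * b for a, b in zip(u, v)) + coef0) ** degree


def direct_summation(xi, xj):
    """Equation (12). Each pixel is a (spectral_vector, spatial_vector) pair."""
    return rbf_kernel(xi[0], xj[0]) + poly_kernel(xi[1], xj[1])


def weighted_summation(xi, xj, mu=0.25):
    """Equation (13) with weight 0 < mu < 1 on the spectral kernel."""
    return mu * rbf_kernel(xi[0], xj[0]) + (1.0 - mu) * poly_kernel(xi[1], xj[1])


def gram_matrix(samples, kernel):
    """Kernel matrix K[a][b] = kernel(samples[a], samples[b])."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Two hypothetical pixels: 3 spectral features and 2 spatial features each.
pixel_a = ([0.1, 0.2, 0.3], [1.0, 2.0])
pixel_b = ([0.3, 0.1, 0.0], [0.5, 1.5])
K = gram_matrix([pixel_a, pixel_b], weighted_summation)
```

Because the spectral and spatial vectors are passed to separate kernels, they may have different dimensions, which is what makes the summation kernels applicable when the cross-information kernel is not. A Gram matrix built this way could be handed to any SVM solver that accepts a precomputed kernel.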
Figure 2. Flowchart of the proposed classification algorithm. 

Figure 3. Block diagram summarizing the most relevant steps of the proposed classification algorithm. 



4. Experimental results 

This section uses both simulated and real hyperspectral 
data sets to illustrate the effectiveness of the proposed 
spectral spatial classification algorithm in different 
analysis scenarios. The remainder of this section is 
organized as follows. Section 4.1 first explains the 
parameter settings adopted in our experimental 
evaluation. Section 4.2 then evaluates the proposed 
algorithm using simulated data sets, whereas Section 
4.3 evaluates the proposed classification algorithm using 
a real hyperspectral image. 

4.1. Parameter Settings 

Before presenting our results with simulated and real 
hyperspectral data sets, we first discuss the parameter 
settings adopted in our experiments. In all experiments, 
we used RBF and polynomial kernels for the spectral and 
spatial features, respectively, to construct composite 
kernels. The training sets are randomly selected from the 
available labeled samples, and the remaining samples 
are used for validation. Ten Monte Carlo runs were 
conducted, and the OA (in percent) is obtained after each 
run. The labeled samples for each Monte Carlo 
simulation are obtained by resampling the available 
learning examples. We optimized the 
SVM parameters using tenfold cross-validation, and we 
computed the EMAP from the first three PCs, which 
account for most of the 
variance present in the considered data sets. 
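
The random selection of training and validation samples over repeated Monte Carlo runs can be sketched as below. This is a minimal illustration, assuming that a fixed number of training samples is drawn per class and that each run reshuffles the labeled pixels independently; the function names and the seed are hypothetical.

```python
import random


def split_per_class(labels, n_train, rng):
    """Randomly pick n_train indices per class for training; the rest validate.

    labels: list with the class id of each labeled pixel.
    Returns (train_indices, validation_indices).
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, validation = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        train.extend(shuffled[:n_train])
        validation.extend(shuffled[n_train:])
    return train, validation


def monte_carlo_splits(labels, n_train, runs=10, seed=0):
    """One independent random split per Monte Carlo run (ten by default)."""
    rng = random.Random(seed)
    return [split_per_class(labels, n_train, rng) for _ in range(runs)]


# Toy label vector: 8 pixels of class 0 and 6 pixels of class 1.
toy_labels = [0] * 8 + [1] * 6
splits = monte_carlo_splits(toy_labels, n_train=3)
```

Each element of `splits` is one run's pair of index lists; the classifier would be trained on the first list and scored on the second, and the scores then averaged over the runs.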

4.2. Experiments with Simulated Hyperspectral Data 

In our experiments, we used a 30x30 simulated 
hyperspectral scene containing 4 classes. The adopted 
spectral signatures are obtained from the U.S. Geological 
Survey (USGS) digital spectral library (Figure 4). EMAP 
were built using threshold values in the range of 10% - 
70% with a step of 20% for the standard deviation 
attribute and thresholds of 30, 40, 50 and 60 for the area 
attribute. In the learning step, we used new data (g 
samples) generated by interpolating the l 
training samples in each class. Hence, the total number of 
training samples is (l+g). 

We conducted four different experiments with the 
simulated hyperspectral image. These experiments 
analyze several relevant aspects of our 
proposed classification method. 



1) In our first experiment, we evaluate the impact of 
the adopted interpolation method on the classification 
result. 

2) In our second experiment, we evaluate the impact 
of the available training set size on the classification 
result. 

3) In our third experiment, we evaluate the impact of 
the number of generated training samples on the 
classification result. 

4) In our fourth experiment, we analyze the impact of 
the parameter μ adopted for the weighted summation kernel 
on the classification output. 

In all these experiments, we use the average and the 
standard deviation of the classification accuracy (OA) and 
kappa coefficient (Kappa) obtained after ten Monte Carlo 
runs as references to evaluate the performance of the 
proposed classification approach. 

$OA = \frac{\text{Number of pixels correctly classified}}{\text{Total number of pixels}} \times 100$ 

$Kappa = \frac{\text{Number of pixels correctly classified}}{\text{Number of pixels correctly classified} + \text{Number of confusion}} \times 100$ 
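
The two measures above, and their mean and standard deviation over Monte Carlo runs, can be computed as in the sketch below. It follows the formulas exactly as written in this paper; the "number of confusion" term is taken as an input because its counting rule is not spelled out here, and the use of the sample standard deviation is an assumption. Function names are hypothetical.

```python
import statistics


def overall_accuracy(true_labels, predicted_labels):
    """OA in percent: correctly classified pixels over all pixels."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


def kappa_as_defined(n_correct, n_confusion):
    """Kappa in percent, following the ratio given in the text.

    n_confusion is supplied by the caller (assumption: it is counted
    separately from the confusion matrix).
    """
    return 100.0 * n_correct / (n_correct + n_confusion)


def summarize(values):
    """Mean and sample standard deviation over the Monte Carlo runs."""
    return statistics.mean(values), statistics.stdev(values)


# Toy example: four pixels, one of them misclassified.
oa = overall_accuracy([0, 1, 1, 2], [0, 1, 2, 2])
mean_oa, std_oa = summarize([90.0, 92.0, 94.0])
```

In an experiment, `overall_accuracy` would be called once per Monte Carlo run on the validation pixels, and `summarize` applied to the resulting list of ten values.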




Figure 4. Ground truth of the simulated hyperspectral scene. 



4.2.1. Experiment 1 — Impact of the applied 
interpolation method 

In this experiment, we analyze the impact of the 
interpolation techniques used to generate new training 
data. Two methods must be applied: one to 
interpolate the spectral samples and another to interpolate 
the spatial samples. To address this issue, we analyze the 
performance of the proposed method for different 
combinations of interpolation methods in a classification 
problem where, in each class, 100 samples are generated 
from 50 available training samples and the direct 
summation kernel is used to combine spectral and spatial 
features. Figure 5 shows the OA and kappa obtained by 
the proposed classification algorithm for each 
applied combination. Notice the good performance 
achieved by the proposed classification algorithm, which 
yielded its best OA and kappa results (OA = 94.48% 
and kappa = 97.29%) with linear interpolation for the 
spectral samples and cubic spline interpolation for the 
spatial samples. Furthermore, all the applied combinations 
gave more accurate results than classification 
without oversampling (OA > OA_without = 75.86% and 
kappa > kappa_without = 85%). This is reasonable since 
the performance of the classification increases as the 
number of training samples increases. It also indicates 
the advantage of generated training data, which increase 
the capability of the SVM to find the optimal separating 
hyperplane. The robustness of the proposed method in the 
presence of very limited training sets is analyzed in more 
detail in the following experiment. 









Figure 5. Variation of OA and Kappa coefficient according to the 
variation of the adopted interpolation techniques. 



4.2.2. Experiment 2 — Impact of the available training 
set size 

In our second simulated image experiment, we analyze 
the impact of the training set size on the classification 
performance. Figure 6 (a) and (b) show the OA and 
standard deviation (Std) results, respectively, obtained by 
our proposed method as a function of the number of 
labeled samples (l) used in the training process, with 
100 samples generated from the l labeled samples using 
the Lin-Spl combination (linear interpolation for the 
spectral samples, cubic spline for the spatial samples) 
and with the direct summation kernel. 
Notice the quality of the classification results obtained by 
our proposed algorithm, which shows high robustness 
even with very limited training set sizes. As the number 
of labeled samples increases, the OA increases and the 
standard deviation decreases. This is expected, since an 
increase in the number of labeled samples should 
decrease the uncertainty when estimating the optimal 
separating hyperplane. 
Figure 6. Variation of OA (a) and standard deviation (b) with the 

[Figure 7 bars: spectral kernel; spatial kernel; direct summation kernel; weighted summation kernel (μ=0.25)] 

Figure 7. Resulting OA and Kappa coefficient according to the adopted 
kernel type 



4.2.3. Experiment 3 — Impact of the adopted composite 
kernel type 

In our third simulated image experiment, we analyze the 
impact of the composite kernel type on the classification 
performance. For that, we applied two different 
composite kernels: the direct summation kernel and the 
weighted summation kernel. The cross-information 
kernel cannot be applied because the spectral and the 
spatial vectors have different dimensions. Figure 7 shows 
the OA and kappa coefficient results 
obtained by our proposed method as a function of the 
adopted kernel type. As shown in Figure 7, the 
performance of the proposed classification algorithm 
increases when using composite kernels, which combine 
spectral and spatial features. Furthermore, we can note 
that the weighted summation kernel, which introduces a 
trade-off (μ) between the spectral and spatial kernels, 
performs more accurately with μ=0.25 than the direct 
summation kernel. 

To assess the impact of the parameter μ, which 
weights the spectral information against the spatial 
information, we analyze the performance of the proposed 
classification method for different values of μ. Figure 8 
(a) and (b) show the OA and kappa coefficient results, 
respectively, obtained by our proposed method as a 
function of μ. Notice the good performance achieved by 
the proposed classification algorithm, which yielded 
its best OA and kappa results (OA = 95.34% and 
kappa = 96.6%) with μ=0.25. This indicates that 
the proposed method performs accurately when the 
spatial information is weighted more heavily than the 
spectral information, which illustrates the importance of 
the spatial features. 







Figure 8. Variation of OA (a) and kappa coefficient (b) according to 
the variation of μ 



4.2.4. Experiment 4 — Impact of the generated training 
samples set size g 

In our last experiment with simulated data, we evaluate 
the impact of the generated data set size g on the 
proposed classification algorithm, using only l=20 
labeled samples per class and the weighted summation 
kernel with p=0.25. Figure 9 (a) and (b) show the OA and 
kappa results as a function of the generated data set size. 
From Figure 9, we can conclude that the classification 
performance does depend on the size of the generated 
training set. For 75<g<150, the proposed 
classification method leads to good results 
(93%<OA<96.6%). This indicates that the method reaches 
high OA values with a limited number of labeled training 
samples (l=20). On the other hand, we have 
experimentally observed that the OA and kappa 
results converge to low values for g>150. This indicates 
that increasing g can lead to results poorer 
than those obtained by classification with l=20 alone, 
i.e., without oversampling (OA=83.5%, kappa=91%). 
The reason is that generating an increasing number 
of samples from a limited set of available training 
samples produces bad training samples that confuse 
the classifier. 
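
As a rough illustration of how g new samples can be generated from l labeled samples of one class, the sketch below uses linear interpolation between randomly chosen pairs of labeled samples. The pairing strategy, the uniform interpolation coefficient, the function name `oversample_linear` and the toy feature vectors are assumptions made for illustration; the text does not specify how the interpolation points are chosen.

```python
import numpy as np

def oversample_linear(samples, g, rng):
    """Create g synthetic samples by linear interpolation between random pairs.

    samples: array of shape (l, n_features) holding the labeled samples of
    one class. Pairs and interpolation coefficients are drawn at random,
    which is an assumption of this sketch.
    """
    n = samples.shape[0]
    i = rng.integers(0, n, size=g)          # first sample of each pair
    j = rng.integers(0, n, size=g)          # second sample of each pair
    t = rng.uniform(0.0, 1.0, size=(g, 1))  # position along the segment
    return (1.0 - t) * samples[i] + t * samples[j]

rng = np.random.default_rng(0)
labeled = rng.normal(size=(20, 8))            # l = 20 toy samples, 8 features
generated = oversample_linear(labeled, g=100, rng=rng)
training_set = np.vstack([labeled, generated])  # l + g samples for this class
```

Every generated sample lies on a segment between two labeled samples, so a large g mostly re-samples the same small region of feature space rather than adding new information.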







Figure 9. Variation of OA (a) and kappa coefficient (b) according to the 
increase in generated set size. 
4.3. Experiments with Real Hyperspectral Data 
In order to evaluate the proposed classification method in 
real analysis scenarios, we use the widely used 
hyperspectral data set collected by the AVIRIS sensor, the 
"AVIRIS Indian Pines" data set. The scene contains 145 x 145 
pixels and 220 spectral bands. The ground-truth data 
contain 16 mutually exclusive classes and a total of 
10366 labeled pixels. This image constitutes a 
challenging problem because of the significant presence of 
classes with similar spectral signatures and because 
of the unbalanced number of available labeled pixels per 
class. 

To evaluate the performance of the proposed method, 
EMAPs were built using threshold values in the range of 
2.5% to 10% with a step of 2.5% for the standard 
deviation attribute and thresholds of 200, 500 and 1000 
for the area attribute. In the training step, we used new 
data generated by interpolating l=50 
training samples for each class. If the total number of 
labeled samples in the reference map for a single class is 
L_c<50, we take l=10. 

Table 1 reports the OA, AA, kappa statistic 
(k), and individual class accuracies (in percent) 
achieved by the proposed classification method 
when new samples are generated from l=50 labeled samples 
in each class and the weighted summation kernel is used 
with p=0.25. By adopting oversampling, the proposed method 
significantly improved the classification results obtained 
by the considered classification without oversampling. 
For instance, using linear interpolation for both the 
spectral and spatial features (Lin_Lin) yields an OA of 
93.21%, 4.39 percentage points higher than that obtained 
by the SVM without oversampling. The samples obtained by 
oversampling therefore play an important role in the 
classification process: they improve the accuracy of the 
supervised classifier (SVMs with composite kernel). Notably, 
oversampling the available labeled samples with the linear 
and cubic spline interpolation methods 
leads to more accurate classification than 
the other combinations 
of Lagrange, cubic spline and linear interpolation 
techniques. This indicates that the samples generated by 
the combination of linear and cubic spline interpolation 
are well formed, in the sense that they are similar to the 
original labeled data of each class. 
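
For reference, the three summary measures reported in this section (OA, AA and the kappa coefficient) can be computed from a confusion matrix as in the following minimal sketch. The function name and the 3-class confusion matrix are made up for illustration and are not taken from the experiments; rows are assumed to be reference classes and columns predicted classes.

```python
import numpy as np

def accuracy_metrics(confusion):
    """Return (OA, AA, kappa, per-class accuracies) from a confusion matrix."""
    c = np.asarray(confusion, dtype=float)
    total = c.sum()
    oa = np.trace(c) / total                      # overall accuracy
    per_class = np.diag(c) / c.sum(axis=1)        # accuracy of each class
    aa = per_class.mean()                         # average accuracy
    # Agreement expected by chance, from the row and column marginals.
    expected = (c.sum(axis=0) * c.sum(axis=1)).sum() / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa, per_class

# Hypothetical 3-class example (50 reference pixels per class).
confusion = [[45, 5, 0],
             [4, 40, 6],
             [1, 2, 47]]
oa, aa, kappa, per_class = accuracy_metrics(confusion)
# oa = 0.88, aa = 0.88, kappa = 0.82 for this toy matrix
```

OA weights every pixel equally, whereas AA weights every class equally, so the two diverge when the classes are unbalanced, as they are in the Indian Pines scene.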

For illustrative purposes, Figure 10 shows the ground 
truth and some of the classification results obtained by 
the different methods for the AVIRIS Indian Pines scene. 
For each method, we randomly selected one of the maps 
obtained after conducting ten Monte Carlo runs. As 
shown in Figure 10, classification after oversampling 
with the Lin_Lin combination produced the best classification 
map. An immediate question raised by the experiments in 
Figure 10 is whether the improvement simply results from 
the larger number of training samples (classification 
without oversampling uses 50 labeled samples, whereas after 
oversampling the classifier uses 50+50 samples). 
To analyze this issue, 
we increase the number of samples generated from the 
l=50 available training samples, using the Lin_Lin 
combination in the oversampling step. Figure 11 (a) and 
(b) show the OA and kappa results as a function of the 
generated data set size. From Figure 11, we can conclude 
that the classification performance does depend on the 
size of the generated training set. In fact, for 50<g<200, 
increasing the size of the generated data improves 
the classification performance, but increasing g 
further (g>200) decreases the accuracy of the classifier. 
This indicates that a large g can produce 
bad training samples that confuse the classifier. 




[Panels: Lin_Lin (93.21%), Spl_Lag (89.18%), Lin_Lag (91.24%)] 

Figure 10. Classification maps obtained by the different interpolation method combinations for the AVIRIS Indian Pines scene (overall accuracies are reported in parentheses). 
Table 1: Overall (OA), average (AA), and individual class accuracies (in percent) and kappa statistic obtained for the AVIRIS Indian Pines data set. 



"Without" is the classification without oversampling; the remaining columns give the classification after oversampling with each combination of interpolation methods. 

Class                   Samples  Without  Lin_Spl  Lin_Lin  Spl_Lin  Spl_Spl  Spl_Lag  Lag_Spl  Lag_Lin  Lin_Lag  Lag_Lag 
Alfalfa                      54       75    79.55    92.68    70.45    88.64    88.64    86.36     80.5    74.42    70.45 
Corn-notill                1434    69.85    77.66    77.13    76.53    71.11     70.7    71.07     70.1    69.25    71.11 
Corn-mintill                834    71.80    84.43    77.84    80.47    89.09    85.06    56.31    69.80    75.13    57.13 
Corn                        234    84.07    88.40    93.48       90    91.06       92    89.44    81.45    84.15    90.00 
Grass/pasture               497    92.97    93.17    93.14    90.32    88.49     89.5    88.74    82.97    91.38    89.50 
Grass/trees                 747    95.03    97.96    96.38    92.65    98.26    93.26    95.92    94.65    95.80    92.75 
Grass/pasture-mowed          26    87.50    93.75    93.33     87.5    87.50    87.50    93.33    87.50    93.75    87.50 
Hay-windrowed               489    99.08    97.90    97.70    99.07    99.31    98.00    97.22    98.00    99.08    97.90 
Oats                         20       50      100      100      100    55.56    53.50    66.67    75.50    66.67    75.50 
Soybean-no till             968    77.68    81.25    86.93    84.33    85.75    78.11    78.37    73.14    80.22    81.22 
Soybean-min till           2468    69.40    79.77    81.27    84.56    74.23    74.23    67.88    67.88    76.18    71.80 
Soybean-clean till          614    82.29    86.23    82.19    80.25    86.69    80.04    76.66    82.29    84.38    84.38 
Wheat                       212      100      100    99.38    99.37    99.38    99.37    98.76      100      100    99.38 
Woods                      1294    89.64    90.49    94.55    95.05    95.85     89.9    93.06    90.23    92.57    89.57 
Bldg-Grass-Trees-Drives     380    88.07    93.87    94.17    88.07       84    87.00    80.73    88.07    87.20    88.20 
Stone-Steel-Towers           95      100      100      100      100    95.56    81.11    97.73    96.30    86.36    86.36 
OA                                 88.82     92.9    93.21    93.18    92.28    90.11    89.18     89.5    91.24    88.87 
AA                                 83.27    90.27    91.26    88.66     86.9    84.24    83.64    83.76    84.79     83.3 
kappa                               93.5    94.55    94.75    93.04    92.49       91     93.7     93.5    94.45       93 



Figure 11. Variation of OA (a) and kappa coefficient (b) according to 
the increase in generated set size. 



5. Conclusion 

In this paper, we have developed a new spectral-spatial 
supervised classification approach that combines 
spectral and spatial features via composite kernels and 
generates new training samples to overcome the limited 
availability of labeled samples, a problem widely 
encountered in the classification of hyperspectral images. 
It uses interpolation techniques to oversample the small 
set of training examples. By using the weighted summation 
kernel with p=0.25 for SVMs, the proposed method provides 
good accuracies when compared with classification 
without oversampling. It also exhibits robustness to 
different criteria, such as the limited availability of 
training samples and the type of adopted kernel. 
Although our experimental results are competitive and 
encouraging when dealing with ill-posed problems, i.e., 
limited training samples versus high dimensionality of 
the input data, further work should focus on the 
oversampling step by applying other interpolation 
techniques. 
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Abstract 

Presently, a considerable number of knowledge 
engineering studies focus on the automatic 
building of ontologies. However, the uncertainty 
of the techniques and possible heuristics adopted 
during the construction process has led researchers 
to explore methods for verifying and improving the 
quality of the outputs. To this end, we propose 
an approach for checking the hierarchical structure of 
ontologies that uses the WordNet lexical database as 
a background knowledge source. To test our 
work, we apply the proposed method to an 
existing, valid ontology of geographic objects. 

Keywords: geographic objects, knowledge modeling, 
ontology building, taxonomic structure, similarity 
measures, evaluation, quality. 

1 Introduction 

Satellite imagery is a relevant source of information for 
identifying the objects that make up the surface 
of the earth. Exploiting these images in a spatio-temporal 
context helps to monitor and predict the behavior of these 
objects over time and to take appropriate decisions for the 
management of the environment. 

Indeed, the advent of high-resolution images has 
enabled the development of an object-oriented approach 
in which the analysis of a scene is attached to groups 
of pixels representing concrete objects that have specific 
semantics. However, this progress leads to a large 
amount of available information that cannot be 
processed in its entirety by domain experts. This has 
motivated research on the full or partial 
automation of knowledge representation 
and extraction applied to satellite imagery. To 
use geographic image databases, researchers have used 
several knowledge representation formalisms, in 
particular ontologies. 



Nowadays, ontologies are becoming very popular 
in the area of knowledge management and sharing, 
especially since the emergence of the Semantic Web. 
They are considered one of the most powerful tools 
for knowledge representation and reasoning. They 
aim to provide a commonly accepted understanding 
of a specific domain through the generic modeling, 
exchange and sharing of its specific knowledge. 
Knowledge is modeled in the form of concepts 
and their relations to each other. Several studies 
have addressed the use of standardized ontologies 
to share and annotate satellite image information 
[7, 1, 8, 4, 26, 20, 31]. The majority of these works 
presuppose the existence of a domain ontology that 
may be developed, or be carried out, within the target 
application [4]. However, few studies have focused on 
the evaluation or validation of such ontologies. 

In fact, the quality of an ontology is highly sensitive 
to many parameters, such as the consistency of 
the semantic resources from which it is built and the 
techniques and heuristics used to extract and organize 
relevant knowledge [19]. Therefore, as for all engineering 
artifacts, assessing the quality of ontologies 
remains an important issue in ontology engineering. 
The evaluation covers the structure and the content of 
ontologies and makes it possible to verify several related 
criteria, such as their consistency and their adequacy to 
the user's requirements and pre-established constraints. 

In this paper, our main research question is how to 
examine the taxonomic structure of a given geographic 
objects ontology. Firstly, we summarize the main 
evaluation alternatives. Secondly, we present our method 
and the related structural measure for verifying the 
hierarchical structure of an ontology based on the 
WordNet lexical database (see footnote 1). Thirdly, we 
devote the last section to an experimental study in which 
we present and interpret the results of applying our 
proposal to a geographic objects ontology. 



1 https://wordnet.princeton.edu/ 
2 Ontology evaluation: State of the art 

Evaluation is a crucial phase in the building process 
of ontologies. It helps to simplify their development, 
to ensure their relevance to the requirements of a 
particular domain and to detect possible ontology 
changes. However, the lack of a unifying framework of 
methods and metrics for evaluating ontologies has 
led to several attempts, each of which defines its own 
method and set of metrics. In this section, we 
summarize the main evaluation methods, which can 
be classified according to their purpose into three 
categories: ranking, correctness, or quality. 

When trying to reuse existing ontologies 
for a particular study domain, we are faced 
with the problem of determining the ones that suit 
our needs. In this context, [17] presented an 
approach for clustering ontologies. The main goal of 
this approach is to use a set of similarity measures 
for comparing ontology-based meta-data. Based on 
this work, [27] developed the OntoQA approach, 
which analyzes ontology schemas and their populations 
and describes them through a well-defined set of 
schema and meta-data metrics. The first group 
includes the schema metrics of ontologies, whose 
purpose is to evaluate the ontology design and its 
potential for knowledge representation. The second 
group evaluates the structure of the 
knowledge base and, more specifically, how data is 
placed in the ontology. 

Further, the ranking category includes approaches 
for ranking and selecting ontologies. These 
approaches rank a set of candidate ontologies 
in order to choose the most appropriate one for a 
particular task. Ontometric [15] is one of the most widely 
used methods for systematic ontology selection. It 
aims to suggest the best ontology for a particular 
project on the basis of 160 properties organized 
in five dimensions of quantitative measurements: 
content, language, methodology, tool and costs. [21] 
provided a corpus-based method to evaluate the 
functional adequacy of ontologies. [22] proposed 
an ontology selection and ranking model consisting 
of selection standards and metrics based on better 
semantic matching capabilities. The proposed model 
enhances ontology selection and ranking 
practically and effectively by enabling 
semantic matching of taxonomy or relational linkage 
between concepts, and by identifying which measures 
should be used to rank ontologies in a given context 
and what weight should be assigned to each selection 
measure. FOEval [3] is another model, which presents 
two main features. First, it enables users to select, 
from a set of proposed metrics, those that help them 
in the ontology evaluation process, and to assign 
a weight to each one based on its assumed impact on this 
process. Second, it enables users to evaluate locally 
stored ontologies and/or to query search engines for 
available ontologies. The main goal of this model is 
to ease the ontology evaluation task for users wishing 
to reuse available ontologies, enabling them to choose 
the ontology most adequate to their requirements. 
To evaluate and rank candidate ontologies, FOEval 
uses a set of metrics that includes coverage, richness, 
detail level, comprehensiveness, connectedness and 
computational efficiency. 

The correctness category includes the approaches 
that account for the formal correctness of the ontological 
knowledge and of the primitives used. In this category, 
the best-known approach is OntoClean [12], which is 
designed to justify the kinds of decisions 
that experienced ontology builders make and to 
explain the common mistakes of inexperienced ones, 
as it analyses the intensional content of concepts. 
It is based on the principles of rigidity, identity, unity 
and dependence. Based on this method, [5] 
developed a framework that looks for taxonomic 
problems such as circularity and redundancy, as well as 
errors in disjoint groups. [28] developed another 
tool for evaluating real-world ontologies. [30] 
proposed a tool that evaluates correctness through an 
internal evaluation based on the correct 
usage of OWL primitives. 

The third category addresses the evaluation of the 
global quality of an ontology. Following this approach, 
the EvaLexon method [25] aims to evaluate 
ontologies during their development from texts. It 
measures how appropriate the terms of the ontology are. 
The relevance of a term is judged by its frequency in 
the text from which the ontology was built and in the 
list of terms for a specific domain. The evaluation 
is based on four metrics: precision, recall, coverage 
and accuracy. In turn, [9] approached 
ontology evaluation as a diagnostic task based on 
ontology descriptions, using three categories of criteria: 
structural (depth, breadth, tangledness, dispersion, 
consistency, anonymous classes, cycles, and density), 
functional (competence adequacy, functional 
modularity, precision, recall and accuracy), and usability 
profiling (documentation, efficiency, interfacing). 
By combining the different measurable criteria of 
each category, nine quality principles (qoods) are 
defined: cognitive ergonomics, transparency, integrity 
and computational efficiency, meta-level integrity, 
flexibility, expertise compliance, conformity with 
extension, integration and adaptation procedures, 
generic access and organizational ability. 

To assess the quality of evolving ontologies, [16] 
proposed a set of cohesion metrics that are 
considered stable, in that their results do not depend on 
the semantic or structural representation of the ontology. 
In the same way, [6] proposed the Onto-Evoal approach, 
which is based on an evaluation model that guides the 
management of inconsistencies by assessing the 
impact of proposed resolutions on the content and use 
of the ontology. This model defines a set of 
quantitative metrics for choosing a resolution that 
preserves the quality of the evolved ontology. The 
quality criteria considered in the proposed approach are 
complexity, cohesion, taxonomy, abstraction, 
modularity, completeness and understanding. Referring 
to the work of [10] and [11], [29] presents a theoretical 
framework for assessing the quality of an ontology for 
the Web. The framework organizes ontology 
evaluation methods along two dimensions: ontology quality 
criteria (accuracy, adaptability, clarity, completeness, 
computational efficiency, conciseness, consistency, and 
organizational fitness) and ontology aspects 
(vocabulary, syntax, structure, semantics, representation, 
and context). Building on the two broad meta-properties of 
unity and simplicity, [2] developed an evaluation 
methodology called OntoAbsolute that assesses 
the taxonomic and non-taxonomic relationships, 
analyzes the conceptual structure and evaluates the 
ontology as a whole. 

3 Proposed evaluation method 

Our method for analyzing the taxonomic consistency 
of ontologies is based on two key elements: (1) the 
projection of the ontology to be evaluated onto WordNet, 
and (2) the checking of the conformity of its 
hierarchical links with those linking the corresponding 
WordNet synsets. 

Figure 1: Mapping between ontology concepts and 
WordNet synsets 

WordNet is an on-line lexical database that lists, 
classifies and connects in various ways the semantic 
and lexical content of a number of languages such 
as English and French [18]. For each word of the 
language, WordNet offers a list of synsets (synonym 
sets) that correspond to all its possible meanings. 
The synset is the building block upon which 
the entire system rests. It corresponds to a group of 
interchangeable words denoting one sense or a 
particular purpose. Words and synsets are 
interconnected by a number of lexical relations such as 
hyponymy/hyperonymy, holonymy/meronymy 
and synonymy/antonymy. These relationships can 
be exploited to explore the exact meaning of a given 
word. Its third release (see footnote 2) offers 155287 
words expressing 117659 different meanings (synsets). 

These values reveal the semantic richness of WordNet 
and enhance its usefulness as a reference 
taxonomy for verifying the structure of ontologies. 
However, its generic nature calls for special 
attention to polysemy problems. Indeed, for a 
given concept identifier, WordNet has multiple possible 
nodes, each of which is part of a particular context 
and refers to a different meaning. Consequently, 
correctly locating a concept in WordNet amounts to 
finding the synset that reflects its exact meaning. 

3.1 Projection of ontology on WordNet 

The aim of this step is to locate the concepts of 
our ontology in WordNet, which serves as a reference 
for the analysis and validation of the taxonomic 
structure of the ontology. 

To do this, we need to find, for each 
concept, the corresponding WordNet synset. 
This treatment obviously cannot be limited to 
a simple term search for the concept identifier in 
WordNet, since the same word can carry 
multiple meanings. Therefore, to be able to map a 
given concept to WordNet, we need to distinguish, 
among all the proposed synsets, the one that corresponds 
best. Our solution is to involve the context 
of the concept when locating it in WordNet. 
The context of a concept is described by its 
identifier, labels, comments, neighborhood and properties. 

The most appropriate synset for a given concept is 
the one that shares with it the maximum of knowledge 
in terms of neighborhood and textual descriptions. 



Given the following: 

W(F, S), which defines the vocabulary admitted 
by WordNet as a set of pairs (F, S), 
where F is a form, i.e. a string over a finite alphabet, 
and S = {s/F} is the set of senses supported by F. 
s denotes an element of the set of meanings S (i.e. a 
synset). 

Let the function P(c, s_i) define the 
degree of knowledge sharing between the concept c 
and the synset s_i, which denotes synset number i 
of the identifier name of c. The relevant synset for a 
concept c must satisfy the following condition (Equation 1): 

$$ s_k = \mathrm{relSyn}(c) \;\Rightarrow\; P(c, s_k) > P(c, s_j) \quad \forall s_j \neq s_k \qquad (1) $$ 

The function P is described by an algorithm that 
uses the following functions: 

• syn(c): a function that returns the synsets 
related to the identifier of the concept c. 

• lab(c): a function that returns the set of labels of 
the concept c. 

• com(c): a function that returns the comments 
associated with the concept c. 

• super(c): a function that returns the direct 
subsumer of the concept c. 

• w_com(c): a function that returns the significant 
words included in the comments associated with 
the concept c. 

• w_syn(s): a function that returns the set of 
synonym words related to the synset s. 

• w_gloss(s): a function that returns the significant 
words that compose the definition associated 
with the synset s. 

2 http://wordnet.princeton.edu/wordnet/man/wnstats.7WN.html 

To extract the relevant synset for the root 
concept of our ontology, we proceed as follows: 

• If the concept has a single synset in WordNet, 
this is the corresponding synset. 

• If the concept has labels, the sharing degree 
between it and a given synset is described by the 
intersection of their respective labels and 
synonyms. 

• If the concept has comments, the sharing 
degree between it and a given synset is described 
by the intersection of their respective comments 
and definitions. 

• Otherwise, the selection can be done manually 
(only for the root concept). 

Input: c (a concept name identifier) 

BEGIN 
  if (|syn(c)| = 1) 
    P = 1 
  else if (|lab(c)| > 0) 
    P = |lab(c) ∩ w_syn(s)| / |lab(c)| 
  else if (|com(c)| > 0) 
    P = |w_com(c) ∩ w_gloss(s)| / |w_com(c)| 
  else return -1 (cannot locate the root c in WordNet) 
END. 

However, the identification of the corresponding 
synset for a given non-root concept c is based on 
the distance that separates this 
synset from the one associated with the closest subsumer 
of c in the ontology to be evaluated. We assume 
that the relevant synset s_k of a given concept is the 
one that, among the synsets s_i of the same concept, 
is connected by the smallest number of 
subsumption links to the synset corresponding to its 
subsumer. In this situation, the degree of knowledge 
sharing is given by Formula 2: 

$$ P(c, s_i) = \frac{1}{\mathrm{distance}(s_i, \mathrm{relSyn}(\mathrm{super}(c)))} \qquad (2) $$ 

3.2 Validation of ontology taxonomic structure 

Once the concepts of the ontology to be evaluated are 
mapped to WordNet synsets, it becomes possible 
to check the compatibility between the taxonomic 
structure of the ontology and that of the corresponding 
synsets. 

The hypothesis on which we base our assessment 
is that a given subsumption relationship between two 
concepts is considered valid only if their corresponding 
synsets are connected by the shortest hyperonymy 
path compared to those linking the synset of the 
subsuming concept to all synsets associated with the 
other concepts. 
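
The root-concept rules of Section 3.1 (single synset, label overlap, comment overlap) can be sketched as follows. This is only an illustrative reading of the pseudocode: the function name, the argument layout and the word sets in the example are assumptions, and the extraction of "significant words" (w_com, w_gloss) is assumed to have been done beforehand.

```python
def sharing_degree_root(n_synsets, labels, comment_words, synonyms, gloss_words):
    """Degree of knowledge sharing P between a root concept and one synset.

    n_synsets     -- |syn(c)|, number of synsets of the concept identifier
    labels        -- lab(c), labels of the concept
    comment_words -- w_com(c), significant words of the concept comments
    synonyms      -- w_syn(s), synonym words of the candidate synset
    gloss_words   -- w_gloss(s), significant words of the synset definition
    Returns -1.0 when the concept cannot be located automatically.
    """
    if n_synsets == 1:
        return 1.0
    if labels:
        return len(set(labels) & set(synonyms)) / len(set(labels))
    if comment_words:
        return len(set(comment_words) & set(gloss_words)) / len(set(comment_words))
    return -1.0

# Hypothetical word sets for a concept without labels.
comment = {"lake", "body", "water", "surrounded", "land"}
gloss = {"body", "fresh", "water", "surrounded", "land"}
p = sharing_degree_root(3, [], comment, ["lake"], gloss)   # 4 shared words out of 5
```

For a non-root concept, the same role is played by the inverse of the distance to the synset of its subsumer, as in Formula 2.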



Several graph-theoretic measures can be used to 
calculate the proximity between two synsets in 
WordNet. They are mainly based on the number of edges 
that separate two nodes in a taxonomy. The most 
commonly used measures in the literature are Rada [23], 
Leacock & Chodorow [14], Hirst & St-Onge [13] and 
Wu & Palmer [32]. The Rada measure is considered the 
most obvious way to evaluate semantic similarity 
in a hierarchical ontology. It corresponds to the 
shortest path between two concepts in an ontology where 
only taxonomic links are considered, i.e. hyperonymy 
and hyponymy. The Leacock & Chodorow measure is an 
extension of Rada that is normalized by 
dividing by the maximum hierarchy depth of 
the involved concepts. The Path measure adopts the same 
principle as the previous two measures by considering 
the inverse of the number of nodes along the shortest 
path between two nodes. As for the Hirst & St-Onge 
measure, similarity between two concepts is determined by 
the minimum number of direction changes of the path 
between the two concepts. According to this 
measure, we distinguish four relation types between 
two concepts: extra-strong, strong, medium 
and weak. The Wu & Palmer measure evaluates the 
similarity between two concepts as the distance of their 
most specific common subsumer to the root of the 
ontology divided by the shortest path between them. 
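
To make the Rada distance and the Path measure concrete, the sketch below computes both on a small hand-made taxonomy. The taxonomy, its node names and the function names are purely illustrative assumptions and do not reproduce the WordNet hierarchy; only hyperonymy/hyponymy links are followed, as in the description above.

```python
from collections import deque

# Hypothetical toy hypernym links (child -> parent); not the real WordNet graph.
HYPERNYM = {
    "lake": "body_of_water",
    "pond": "lake",
    "stream": "body_of_water",
    "river": "stream",
    "body_of_water": "thing",
    "artifact": "thing",
    "canal_duct": "artifact",
}

def neighbours(node):
    """Nodes reachable from node by one hyperonymy or hyponymy link."""
    out = []
    if node in HYPERNYM:
        out.append(HYPERNYM[node])
    out.extend(child for child, parent in HYPERNYM.items() if parent == node)
    return out

def shortest_path_edges(a, b):
    """Rada distance: number of edges on the shortest taxonomic path (BFS)."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected

def path_similarity(a, b):
    """Path measure: inverse of the number of nodes on the shortest path."""
    edges = shortest_path_edges(a, b)
    return None if edges is None else 1.0 / (edges + 1)

sim_lake = path_similarity("lake", "body_of_water")         # 2 nodes on the path
sim_pond = path_similarity("pond", "body_of_water")         # 3 nodes on the path
sim_duct = path_similarity("canal_duct", "body_of_water")   # 4 nodes on the path
```

In this toy graph a direct hyponym of a node obtains a similarity of 0.5, and each additional link on the path lowers the value (1/3, then 0.25).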



4 Experimentation 

In this section, we present and interpret the results of 
applying our proposed evaluation method to a part 
of the AKTiveSA ontology (see footnote 3). This ontology 
deals with a number of geographical aspects of the 
knowledge infrastructure for humanitarian and disaster 
relief operations. It encompasses a wide variety of 
conceptualizations including terrain features, transport 
routes, rivers, shorelines, terrain elevation data, etc. 
[24]. The part to which we limit our experimental 
study represents a hierarchy of 23 concepts 
modeling some Earth hydrographic objects (Table 2). Our 
scope of analysis is restricted to concepts whose 
names appear in WordNet. 

3 http://www.zaltys.net/ontology/AKTiveSAOntology.owl 

Figure 2: Taxonomic structure of the AKTiveSA 
ontology 

As indicated above, the matching between 
the concepts of the ontology to be evaluated and the 
WordNet synsets may be supported by the texts associated 
with them, but also by their subsumers in both 
hierarchies. The concepts of our ontology have no labels. 
Table 1 shows the comments associated with each 
concept of the analyzed part. 



Table 1: AKTiveSA concepts and relative comments 



Concept        Comments 
-------------  --------------------------------------------- 
Body of water  Represents planetary structures that are 
               part of the hydrosphere and that have a 
               primary substance composition of a water. 

Aquifer        An aquifer is an underground structure of 
               water-bearing, permeable rock. 

Reservoir      - 

Pond           A pond is a body of water smaller than 
               a lake. However the difference between a 
               pond and a lake is largely subjective. The 
               term pond usually describes small bodies 
               of water, generally smaller than one would 
               require a boat to cross. Another definition 
               is that a pond is a body of water where 
               even its deepest areas are reached by 
               sunlight. 

Lake           A lake is a body of water surrounded by 
               land. 

Stream         A stream is a body of water with a 
               detectable current, confined within a bed 
               and banks. Stream is also an umbrella 
               term used in the scientific community for 
               all flowing natural waters. 

River          A river is a large stream, which may also 
               be a water way. 

Canal          Canals are man-made waterways, usually 
               connecting existing lakes, rivers, or 
               oceans. Irrigation canals are man-made 
               waterways for the delivery of water and 
               preceded the use of transportation canals 
               used by barges or narrowboats on smaller 
               canals, and by ships on ship canals that 
               connect to the ocean. 

Creek          In British English and Indian English 
               usage, a creek is a tidal water channel. 
               Creeks may often dry to a muddy channel 
               with little or no flow at low tide, but 
               often with significant depth of water at high 
               tide. 

Spring         A spring is a point where groundwater 
               flows out of the ground, and is thus where 
               the aquifer surface meets the ground 
               surface. 

Ocean          A large body of water constituting a 
               principal part of the hydrosphere. 



The results of the evaluation of the structural 
proximity between each concept and each of its 
corresponding synsets are given in Table 2. For each of 
these concepts, we indicate the Path similarity 
between each of its related synsets and the synset that 
corresponds to its closest subsumer concept in the 
AKTiveSA ontology. The most relevant synset for 
a given concept is the one with the highest Path 
similarity value (written in bold). 



Table 2: The taxonomic proximity values between 
concepts and related synsets. 



Concept    #n#1  #n#2  #n#3  #n#4  #n#5  #n#6 
Pond       0.33     -     -     -     -     - 
Aquifer    0.16     -     -     -     -     - 
Lake        0.5  0.11  0.11     -     -     - 
Stream      0.5  0.09  0.08  0.11  0.08     - 
Ocean       0.5  0.11     -     -     -     - 
Reservoir   0.1   0.5  0.35  0.25     -     - 
Canal      0.25  0.12  0.10     -     -     - 
Creek       0.5  0.11     -     -     -     - 
River       0.5     -     -     -     -     - 
Spring     0.09  0.09  0.14  0.11  0.09  0.08 
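
The selection rule applied to Table 2 (keep, for each concept, the synset with the highest Path similarity) amounts to a simple arg-max. The short sketch below applies it to the values of Table 2; the dictionary layout and the function name are illustrative choices.

```python
# Path similarity values copied from Table 2 (sense 1 first).
TABLE_2 = {
    "Pond": [0.33],
    "Aquifer": [0.16],
    "Lake": [0.5, 0.11, 0.11],
    "Stream": [0.5, 0.09, 0.08, 0.11, 0.08],
    "Ocean": [0.5, 0.11],
    "Reservoir": [0.1, 0.5, 0.35, 0.25],
    "Canal": [0.25, 0.12, 0.10],
    "Creek": [0.5, 0.11],
    "River": [0.5],
    "Spring": [0.09, 0.09, 0.14, 0.11, 0.09, 0.08],
}

def most_relevant_sense(scores):
    """Return the 1-based sense number with the highest similarity."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best + 1

relevant = {concept: most_relevant_sense(s) for concept, s in TABLE_2.items()}
# e.g. relevant["Reservoir"] == 2 and relevant["Spring"] == 3
```

The selected senses agree with the synsets used as column headers in Table 3 (reservoir#n#2 and spring#n#3, sense 1 for the other concepts).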



Except in the case of the concept Canal, the 
comparison of the results detailed in Table 2 with the 
definitions of the concepts (Table 1) and their 
relevant synsets (Table 4) shows the effectiveness of our 
approach in identifying concepts in WordNet. We clearly 
notice that the definitions of the AKTiveSA concepts 
are highly compatible with the glosses of the synsets 
found. Furthermore, locating concepts in 
WordNet helps to better understand their contexts. 
The example in Table 3 reinforces this idea and shows 
that, among the three synsets related to the concept 
Canal, the first synset has the highest structural 
similarity to the representative synsets of the other 
concepts. 



Table 3: The Path similarity between the synsets of 
canal and the other concept synsets 





           b. water  aquifer  reservoir  pond   lake 
           #n#1      #n#1     #n#2       #n#1   #n#1 
canal#n#1  0.33      0.12     0.20       0.20   0.25 
canal#n#2  0.14      0.10     0.11       0.11   0.12 
canal#n#3  0.11      0.12     0.09       0.09   0.10 

           stream    river    creek      spring ocean 
           #n#1      #n#1     #n#1       #n#3   #n#1 
canal#n#1  0.25      0.20     0.20       0.12   0.25 
canal#n#2  0.12      0.11     0.11       0.10   0.12 
canal#n#3  0.10      0.09     0.09       0.12   0.10 



On the other hand, by browsing through the 
glosses of the three synsets of Canal, it seems clear 
that the definition "long and narrow strip of water 
made for boats or for irrigation" associated with the 
third synset is more appropriate to the context of 
geographic objects than that of the first synset. A simple 
computation of the terminological intersection 
between the concept comments and both associated 
synset glosses reinforces this view and allows 
us to conclude that this definition is the closest to 
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